ABSTRACT

Problems of information processing and optimal surveillance in a false target
environment are investigated with ASW applications in view. The information
processing procedures, among other things, make use of adaptive estimation
techniques in order to identify uncertain system parameters. Procedures are
presented for computing real-time estimates of the target location probability
distribution in realistic tactical scenarios involving moving targets and false
sensor responses. The procedures are applied to a variety of illustrative examples
pertaining to the processing of responses obtained from a fixed sensor field in
barrier and area surveillance scenarios.


The optimal allocation of ASW search resources in a false target environment
is investigated in an exploratory analysis of an idealized surveillance situation.
Several allocation policies are formulated, including one based upon some concepts
of information theory. This "maximum information gain" policy is shown by
numerical examples to have very desirable characteristics. In order to further
establish the relevance of the information-theoretic approach to the surveillance
problem, the latter is formulated as a type of sequential statistical experimental
design problem which has been studied extensively using information-theoretic
concepts.

This is a report to Naval Analysis Programs, Office of Naval Research
(Code 431), under Contract No. N00014-71-C-0309. It presents methods of
processing information from ASW sensors in the presence of false targets and
for planning surveillance actions based on such processing. The methods are
presented in ways that are suitable for real-time computer assistance to ASW
surveillance operations and have in fact been motivated by actual applications of
this nature. Related prior applications have also included computerized assistance
to search and rescue operations by the Coast Guard.

SUMMARY


This  report  addresses  problems  pertaining  to  ASW  information  processing 
and  optimal  surveillance  in  a false  target  environment.  The  objective  is  to 
provide  useful  concepts  and  practical  approaches  for  answering  the  question, 
"Where is the target?". Potential applications include continuous broad
localization  of  the  target  through  intermittent  application  of  ASW  search  at 
selected  times  and  places.  No  ASW  action  other  than  target  location  is  considered. 

The results on ASW information processing are given in Chapter II, and the
results on optimal surveillance are given in Chapter III. Chapter I provides a
brief introduction, and Appendices A and B support the material presented in
Chapters II and III.

The first two sections of this summary discuss Chapters II and III. The
third section discusses the appendices.


ASW  Information  Processing 

Chapter  II  presents  methods  for  processing  ASW  information  in  order  to 
compute  real-time  estimates  of  the  target  location  probability  distribution.  An 
illustrative  ASW  setting  is  used  to  demonstrate  the  potential  applications  of  the 
processing concepts, and extensive numerical results are given. The methodology
is discussed in computer programming oriented language in the final
section of Chapter II and in greater generality in Appendix A.

The target location probability distributions are computed by Monte Carlo
simulation and are expressed discretely in terms of grid cell probabilities. It
is  assumed  for  illustrative  purposes  in  Chapter  II  that  a fixed  sensor  field 
provides  the  source  of  real-time  input  to  the  processing  system. 

The term sensor response is used to indicate that a decision has been made
that the sensor output contains a sufficient number of target-related cues so that
the hypothesis that the target is present is preferred to the alternative hypothesis
that the target is not present. A false response is a response generated by a
non-target-related mechanism. The causes of false responses are dealt with
from a decision-theoretic point of view in a predecessor report (reference [a]);
the present report adopts an operational point of view and focuses on overcoming
the adverse effects of false targets.

The methods of Chapter II and Appendix A have been applied without false
target considerations in Coast Guard search and rescue (SAR) cases (see
reference [b]) and in certain ASW situations. In each instance, successful
implementation  of  the  methods  has  depended  upon  exploitation  of  the  unique 
aspects  of  each  application  and  construction  of  a mathematical  model  having  a 
level  of  detail  and  realism  consistent  with  the  quality  of  the  data  and  with  the 
constraints  imposed  by  computer  memory  size  and  computation  speed.  In  view 
of  this,  the  results  in  this  report  are  intended  to  point  the  way  rather  than  to 
give  a comprehensive  treatment  which  would  cover  all  possible  circumstances. 

Among  other  things,  Chapter  II  shows  how  to  make  use  of  adaptive  estimation 
techniques (see reference [c]) in order to identify uncertain system parameters.
In a sense, one begins with a family of models and the "correct" model is
identified  adaptively  on  the  basis  of  observational  information. 

For  example,  the  particular  stochastic  process  underlying  target  motion  is 
not  assumed  to  be  known.  Rather,  several  possible  processes  ("scenarios")  are 
postulated  and  each  one  is  given  an  a priori  probability  ("credence").  The 
processing  system  revises  the  credences  in  accordance  with  the  input  sensor 
responses. Those scenarios which are most in agreement with the sensor
responses  eventually  develop  the  highest  credences. 
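
The credence revision described above is an application of Bayes' rule. The following is a minimal sketch rather than the report's algorithm: the function name `update_credences` and the likelihood values are hypothetical, and each likelihood stands for the probability of the observed sensor responses under one scenario.

```python
def update_credences(credences, likelihoods):
    """Revise scenario credences by Bayes' rule.

    credences   -- prior probability of each scenario (sums to 1)
    likelihoods -- probability of the observed sensor responses under
                   each scenario (hypothetical inputs for illustration)
    """
    joint = [c * l for c, l in zip(credences, likelihoods)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("responses are impossible under every scenario")
    return [x / total for x in joint]


# Three equally likely scenarios; the likelihood values are made up.
prior = [1 / 3, 1 / 3, 1 / 3]
likelihoods = [0.02, 0.30, 0.10]
posterior = update_credences(prior, likelihoods)
```

Applying the function again after each field glimpse, with the previous posterior as the new prior, concentrates credence on the scenarios that best explain the responses.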

Similarly, the single-sensor, single-glimpse probabilities of detection and
false alarm are treated as unknown parameters. They are, however, related
through  a known  ROC  relationship.  The  probability  of  detection  is  initially 
assumed  to  be  a random  variable  with  a uniform  distribution  between  known  limits 
and  the  processing  system  adaptively  revises  this  distribution  in  accordance  with 
the  sensor  responses. 

Table S-1 indicates illustrative results of the adaptive estimation procedures.
In all cases, the scenarios for target motion are considered a priori to be equally
likely. The true detection probability P_D = .8 is not known; it is assumed that
P_D is a particular value of a random variable which is uniformly distributed
in the interval from .5 to .9. The expected value of this prior probability
distribution is .7. The estimated detection probabilities given in Table S-1
after incorporating sensor field responses are the expected values of the
posterior distributions for P_D. The processing algorithms make use of the
entire distribution for P_D, however, and not just the expected value.
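
The idea of carrying a whole distribution for P_D can be illustrated on a discrete grid. This is a simplified sketch under assumptions that are not in the report: it ignores false responses and the ROC relationship, and it supposes (hypothetically) that the target is known to be within range for every glimpse counted, so the likelihood is a plain Bernoulli product. The names `grid_posterior` and `mean` and the observation counts are made up for illustration.

```python
def grid_posterior(grid, prior, responses, glimpses):
    """Posterior over candidate P_D values after `responses` target-like
    responses in `glimpses` independent glimpses (Bernoulli likelihood)."""
    like = [pd ** responses * (1.0 - pd) ** (glimpses - responses)
            for pd in grid]
    joint = [p * l for p, l in zip(prior, like)]
    total = sum(joint)
    return [x / total for x in joint]


def mean(grid, weights):
    """Expected value of a discrete distribution on `grid`."""
    return sum(g * w for g, w in zip(grid, weights))


n = 41
grid = [0.5 + 0.4 * k / (n - 1) for k in range(n)]  # .50, .51, ..., .90
prior = [1.0 / n] * n                               # uniform prior on [.5, .9]
prior_mean = mean(grid, prior)                      # expected value near .7

# Hypothetical data: 8 responses in 10 glimpses with the target in range.
post = grid_posterior(grid, prior, responses=8, glimpses=10)
post_mean = mean(grid, post)
```

The posterior mean moves from the prior value toward the observed response rate, while the full vector `post` remains available to downstream processing.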

Table S-1(a) pertains to a target patrolling station and is based on the results
shown in Table II-1 of Chapter II. The correct scenario for target motion in this
example is Scenario 2, and Table S-1(a) shows that the credence associated with
this scenario rises to .95 as a result of processing all the sensor response
information for four field glimpses. The estimated detection probability is .77,
compared to the actual value of .8.

TABLE S-1

SUMMARY OF ADAPTIVE ESTIMATION RESULTS

Notes: (1) This table indicates illustrative results of adaptive
           estimation of target scenario and detection probability.
       (2) The correct target scenario is marked with an asterisk (*), and
           in all cases the true single-sensor, single-glimpse
           detection probability is P_D = .8.
       (3) "Est. P_D" is the estimated single-sensor, single-glimpse
           detection probability (true value is .8).

(a) Target Patrolling Station (see Table II-1)

                                              Scenario Credences
                                              1     2*    3       Est. P_D
No Sensor Information Used                    .33   .33   .34     .70
All Sensor Information Used                   .00   .95   .05     .77
  (96 hours into mission, 4 field glimpses)

(b) Target in Transit (see Table II-2)

                                              Scenario Credences
                                              1*    2*    3     4     5       Est. P_D
No Sensor Information Used                    .2    .2    .2    .2    .2      .70
All Sensor Information Used                   .75   .18   .01   .03   .03     .73
  (48 hours into mission, 3 field glimpses)
All Sensor Information Used                   .33   .60   .00   .02   .05     .78
  (96 hours into mission, 5 field glimpses)

(c) Target Out of Grid Area (see Table II-3)

                                              Scenario Credences
                                              1     2     3     4     5*      Est. P_D
No Sensor Information Used                    .2    .2    .2    .2    .2      .70
All Sensor Information Used                   .02   .06   .07   .08   .77     .83
  (96 hours into mission, 5 field glimpses)


Table S-1(b) pertains to a target in transit and is based on the results shown
in Table II-2. In this illustration, the target is assumed to be a late Scenario 1
or an early Scenario 2. That is, the true target's position falls midway between
the mean positions prescribed by Scenarios 1 and 2. Table S-1(b) shows that as
a result of processing five field glimpses, the total credence associated with the
two closest scenarios is .93 and the estimated detection probability is .78.

Table S-1(c) pertains to a target which is out of the grid area, that is, during
the period of observation considered there has been no transit by the target through
the area of interest. The true scenario in this illustration is Scenario 5, and as a
result of processing five field glimpses, the credence associated with this scenario
rises to .77. The estimated detection probability is .83.

Figure S-1 shows selected probability distributions for the cases considered
in Tables S-1(a) and S-1(b). It should be noted that only probability which falls
within the grid is shown, and thus the numbers need not add to one. The
probability distributions on the left are based upon the a priori scenario and make
use of no sensor information. The probability distributions on the right are
based upon use of all sensor information available.

It is evident from Figure S-1 that processing of sensor response information
by the methods described in Chapter II and Appendix A results in considerable
concentration of the target location probability distribution.


Optimal  Surveillance 

Chapter III is addressed to optimal surveillance in a false target environment.
An exploratory analysis is presented for the purpose of gauging the effectiveness
of a surveillance policy based upon maximization of the expected information gain
in the target location probability distribution. Here, the term information is used
in the technical sense of communications theory (see, for example, reference [d]).

The concepts presented in this chapter are expressed in terms of rather
idealized assumptions, and further development is required before application can
be made to large-scale practical problems. The objective of this chapter is to
demonstrate through examples that the concepts of information theory are relevant
to certain kinds of search and surveillance problems, particularly when false
targets are considered.

It is assumed that the performance of an ASW search system is idealized in
terms of a J x J response array R = [R(i, j)] (J is the number of search cells),
where R(i, j) is the probability that an increment of search effort applied to the
jth cell will result in a response given that the target is located in the ith cell.

[FIGURE S-1. The influence of sensor responses on the target location
probability distribution. Note: only probability inside the grid is shown, and
thus the numbers need not add to one. Panel (a): Target patrolling station (see
Figures II-5 and II-6); no sensor information used (96 hours into mission,
0 field glimpses) versus all sensor information used (96 hours into mission,
4 field glimpses). Panel (b): Target in transit (see Figures II-9 and II-10);
no sensor information used (48 hours into mission, 0 field glimpses) versus
all sensor information used (48 hours into mission, 3 field glimpses).]

The  desired  modification  of  the  target  location  probability  distribution  is 
accomplished  by  the  sequential  application  of  search  in  selected  cells.  The 
surveillance  is  carried  out  in  stages,  and  at  the  end  of  each  stage  one  is  required 
to estimate which cell contains the target. For all policies examined, the
selection rule at the end of a stage is to pick the cell having the highest target
location probability based upon evaluation of the search results.

The  cell  searched  during  a stage,  however,  may  or  may  not  be  the  highest 
probability  cell  depending  upon  the  policy. 

The measures of effectiveness are the probabilities S(k) of correctly selecting
the cell containing the target at the end of the kth stage for k = 1, 2, .... A
surveillance policy which maximizes S(k) for some particular k is referred to as
a k-optimal surveillance policy, and a surveillance policy which maximizes S(k)
for all k ≥ 1 is referred to as a uniformly optimal surveillance policy. Within
the framework of our analysis, k-optimal policies are guaranteed to exist since
the set of all possible policies is finite; however, existence of uniformly optimal
surveillance policies is not guaranteed.

When target motion is considered, it is assumed for illustration to be Markovian.
This is not essential, however, and target motion could equally well be described
by the non-Markovian processes considered in Chapter II and Appendix A. The
J x J transition matrix D for the Markov process is assumed for illustration to be
given by, for some 0 ≤ δ ≤ 1,

$$
D = \begin{pmatrix}
1 - \frac{(J-1)\delta}{J} & \frac{\delta}{J} & \cdots & \frac{\delta}{J} \\
\frac{\delta}{J} & 1 - \frac{(J-1)\delta}{J} & \cdots & \frac{\delta}{J} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\delta}{J} & \frac{\delta}{J} & \cdots & 1 - \frac{(J-1)\delta}{J}
\end{pmatrix}.
$$

This transition matrix depends upon a single parameter δ, here referred to as the
dispersion constant. The initial distribution for the process is denoted d.
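
The effect of the dispersion constant can be checked numerically. The sketch below assumes that D has diagonal entries 1 - (J-1)δ/J and off-diagonal entries δ/J, so that δ = 0 gives no motion and δ = 1 gives complete dispersion to a uniform distribution. The function names and the starting distribution are illustrative choices, not taken from the report.

```python
def transition_matrix(J, delta):
    """J x J Markov transition matrix with dispersion constant delta.

    Assumed form: off-diagonal entries delta/J, diagonal entries
    1 - (J-1)*delta/J, so every row sums to one.
    """
    off = delta / J
    diag = 1.0 - (J - 1) * delta / J
    return [[diag if i == j else off for j in range(J)] for i in range(J)]


def disperse(p, D):
    """Apply one transition: new_p[j] = sum_i p[i] * D[i][j]."""
    J = len(p)
    return [sum(p[i] * D[i][j] for i in range(J)) for j in range(J)]


d = [0.75, 0.15, 0.10]  # illustrative three-cell distribution
no_motion = disperse(d, transition_matrix(3, 0.0))  # unchanged
partial = disperse(d, transition_matrix(3, 0.3))    # partly flattened
full = disperse(d, transition_matrix(3, 1.0))       # uniform
```

Intermediate values of δ pull the distribution part of the way toward uniform at each stage, which is the information loss that search must offset.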

The problem of finding an optimal surveillance policy can be formulated in
terms of a stochastic control problem, and this is discussed briefly in Chapter III.
Visualizing the problem in this way, it seems apparent that k-optimal plans may
be found by dynamic programming, but we do not develop these solutions in this
report. Our interest is in the entire time behavior of the success function S
rather than the value of the function at some fixed stage.

Four surveillance policies are examined in Chapter III using a variety of
assumptions about the false target environment and about the prior target
location probability distribution and target motion characteristics. To do this,
let (for a given stage) P_B(j) be the before-search probability that the target is
located in the jth cell for 1 ≤ j ≤ J. Let P_A(r, i, j) be the conditional after-search
probability that the target is located in the ith cell given that the jth cell was
searched and result r was obtained. Here, r = 1 indicates a target-like response
and r = 0 indicates a non-target-like response. The four policies examined are
as follows:

I. The optimal single-stage look-ahead policy. The optimal
single-stage look-ahead policy is to search in the cell which, based
upon the estimated vector P_B, maximizes the probability of correctly
selecting the target cell at the end of a single stage. This is a
generalization of the optimal whereabouts plan formulated in reference [e]
for searches without false responses. If

$$
B(j) = \max\{P_B(i)\,R(i,j) : 1 \le i \le J\} + \max\{P_B(i)\,[1 - R(i,j)] : 1 \le i \le J\},
$$

then it is shown in Chapter III that the optimal single-stage look-ahead
policy is to search in any cell j* for which

$$
B(j^*) \ge B(j) \quad \text{for } 1 \le j \le J.
$$


II. The maximum information-gain policy. The maximum
information-gain policy is to search in the cell which maximizes the expected
information content (or, equivalently, minimizes the expected entropy)
of the posterior after-search target location probability distribution.

For any discrete probability distribution P over J cells, the
entropy H(P) is defined by

$$
H(P) = -\sum_{j=1}^{J} P(j) \ln P(j).
$$

The expected entropy U(j) of the posterior target location probability
distribution given search in cell j is shown in Chapter III to be given
by

$$
U(j) = -\sum_{i=1}^{J} P_B(i)\,\{R(i,j) \ln P_A(1,i,j) + (1 - R(i,j)) \ln P_A(0,i,j)\}.
$$

The maximum information-gain policy is to search in any cell j*
for which

$$
U(j^*) \le U(j) \quad \text{for } 1 \le j \le J.
$$


III. The highest probability cell policy. The highest probability
cell policy is to search in the cell with the highest probability, that is,
to search in any cell j* for which

$$
P_B(j^*) \ge P_B(j) \quad \text{for } 1 \le j \le J.
$$

Once P_B is determined from the search results of the previous stage,
this plan does not make further use of the response matrix R.


IV. The uniform surveillance policy. The uniform surveillance
policy is to search systematically through all search cells in a fixed
rotation, that is, one searches the J cells in order and then repeats
as often as required. This plan makes use of neither the target
location probability distribution nor the response matrix.
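
For reference, the after-search probabilities that enter Policies I and II can be written out with Bayes' rule. This is a sketch in the notation defined above, assuming (consistently with the definition of R) that a search of cell j yields a target-like response with probability R(i, j) when the target is in cell i:

$$
P_A(1,i,j) = \frac{P_B(i)\,R(i,j)}{\sum_{m=1}^{J} P_B(m)\,R(m,j)},
\qquad
P_A(0,i,j) = \frac{P_B(i)\,[1 - R(i,j)]}{\sum_{m=1}^{J} P_B(m)\,[1 - R(m,j)]}.
$$

Under this assumption, multiplying each posterior by the probability of the corresponding result and taking the largest term over i gives the two maxima in B(j), and weighting the posterior entropies by the same result probabilities gives U(j).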


Figure S-2 illustrates the behavior of the above-mentioned surveillance
policies in one of the cases (Case 1(a)) considered in Chapter III. The target is
assumed to be stationary with a uniform prior distribution, i.e., d(1) = .33,
d(2) = .33, and d(3) = .34. For all cells, if the target is in the cell searched,
then the probability of response is .8. The probability of false response is .7
in the first cell and .1 in the second and third cells. This means that very little
information is gained by a search in the first cell since the probabilities of
correct response and false response are nearly equal.
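
The first-stage scores for Case 1(a) can be reproduced from the definitions of B(j) and U(j). The sketch below is an illustration, not the report's program: the function names are hypothetical, and the response matrix is built from the stated probabilities with rows indexed by the target cell and columns by the searched cell.

```python
import math


def lookahead_score(prior, R, j):
    """B(j): probability of picking the right cell after searching cell j."""
    n = len(prior)
    return (max(prior[i] * R[i][j] for i in range(n))
            + max(prior[i] * (1.0 - R[i][j]) for i in range(n)))


def entropy(p):
    """H(P) = -sum P ln P, with 0 ln 0 taken as 0."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def expected_entropy(prior, R, j):
    """U(j): expected posterior entropy given a search of cell j."""
    n = len(prior)
    total = 0.0
    for r in (1, 0):
        joint = [prior[i] * (R[i][j] if r == 1 else 1.0 - R[i][j])
                 for i in range(n)]
        p_r = sum(joint)              # probability of result r
        if p_r > 0.0:
            total += p_r * entropy([x / p_r for x in joint])
    return total


prior = [0.33, 0.33, 0.34]
# R[i][j]: response probability when target is in cell i+1, search in cell j+1.
R = [[0.8, 0.1, 0.1],
     [0.7, 0.8, 0.1],
     [0.7, 0.1, 0.8]]
B = [lookahead_score(prior, R, j) for j in range(3)]
U = [expected_entropy(prior, R, j) for j in range(3)]
```

With these inputs the first cell has both the lowest B(j) and the highest U(j), which matches the observation that searching there yields very little information.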

Figure S-2 indicates that in the earlier stages of search, there is little
difference between the maximum information-gain policy and the optimal
single-stage look-ahead policy. Asymptotically (i.e., for large k), the maximum
information-gain policy appears to have a slight advantage. The uniform surveillance
policy (Policy IV) also does well in this example, but the highest probability cell
policy (Policy III) is not particularly attractive.

Figure S-3 shows the influence of target motion on probability of success for
the maximum information-gain policy. In this case (Case 1(b)), the response
matrix R is the same as in Figure S-2, but the prior distribution is non-uniform
with d(1) = .75, d(2) = .15, and d(3) = .10. The three examples shown correspond
to values of the dispersion constant δ = 0, δ = .3, and δ = 1. The case where
δ = 0 corresponds to no target motion, and, consequently, this curve is the same
as that given in Figure S-2 for this policy. The case where δ = 1 corresponds
to complete dispersion of the target location probability distribution to a uniform
distribution at each stage. The first transition of the Markov process is made
at the end of the first stage. Therefore, S(1) is identical for all three values of
the dispersion constant. When 0 < δ ≤ 1, the curves do not appear to approach 1
asymptotically. In these cases, it appears that equilibrium is reached for large
values of k in the sense that the information gained by search is balanced by the
information lost by dispersive target motion.

The principal conclusion of Chapter III is that the maximum information-gain
policy appears to have very desirable characteristics in the idealized surveillance
scenario considered. In all cases considered, it is the best or nearly the best
of all the plans considered. Moreover, for each alternative policy, there is at
least one case given where the maximum information-gain policy is
much better. This conclusion appears to be at variance with some previous
investigations into the value of information theory in search problems; these
other investigations are reviewed briefly in the final section of Chapter III.


Appendices 

Appendix  A provides  a generalized  treatment  of  the  information  processing 
concepts  described  and  applied  in  Chapter  II.  Knowledge  of  the  mathematical 
structure of these information processing procedures makes it possible to carry
out  deeper  investigations  of  their  characteristics  and  scope.  An  understanding  of 
Appendix  A,  however,  is  not  required  in  order  to  undertake  the  development  of 
new  processing  systems;  the  last  section  of  Chapter  II  should  suffice  for  this 
purpose. 

Appendix B formulates the search and surveillance problem as a statistical
sequential experimental design problem. The purpose of this formulation is to
suggest a theoretical framework for applying information theoretic concepts to
surveillance problems. In particular, the maximum information-gain policy of
Chapter III is shown to correspond to Lindley's approach (see reference [f]) in
sequential experimental design. It is also shown that the problem of which cell
to search at each stage of a surveillance operation may be viewed as a game
between the search planner and nature in which the payoff to the search planner
is measured in terms of the information he gains about the true state of nature.

[FIGURE S-3. Influence of target motion on probability of success for the
maximum information-gain policy.]


ASW  INFORMATION  PROCESSING  AND  OPTIMAL 
SURVEILLANCE  IN  A FALSE  TARGET  ENVIRONMENT 


CHAPTER I
INTRODUCTION 


This  report  addresses  problems  pertaining  to  ASW  information  processing 
and  optimal  surveillance  in  a false  target  environment.  The  objective  is  to 
provide useful concepts and practical approaches for answering the question,
"Where is the target?". Potential applications include central-site processing
of  data  from  fixed  surveillance  systems,  VP  mission  planning  and  analysis  at 
Tactical  Support  Centers  (TSCs),  ocean  surveillance,  and  processing  of  diverse 
kinds  of  ASW-related  information  by  computerized  command  and  control  systems 
(ASWCCS,  WWMCCS,  etc.). 

In  this  report,  the  term  sensor  is  used  to  denote  the  entire  sensing  system 
consisting  of  transponder,  processor,  display,  and  human  operator.  A target-like 
sensor response results from a decision based upon the inputs to the sensing system in
favor  of  the  hypothesis  that  the  target  is  present  as  opposed  to  the  alternative 
hypothesis  that  the  target  is  not  present.  A target-like  response  may  be  generated 
by  the  target  (a  true  response)  or  by  some  other  non-target-related  mechanism  (a 
false  response). 

A predecessor report (reference [a]) deals extensively with the causes of
false  responses  and,  among  other  things,  provides  quantitative  models  for 
including  false  responses  in  ASW  computer  simulations. 

The  present  report  assumes  that  the  occurrence  of  false  responses  is  an 
unavoidable  operational  fact  of  life  and  focuses  on  the  problem  of  what  to  do  about 
them. 

We are interested in utilizing the information provided by sensor responses
for the purpose of making target location predictions. In parts of an operating
area where there are few false response stimuli, such as those produced by shipping or
biological activity, a target-like response conveys considerable information about
target presence. In other areas which abound in false response stimuli, a single
target-like response has less meaning and importance.

For our purposes, search is defined as the act of acquiring response/no
response data from the sensors. In most treatments of search theory, the
objective of search is target detection, i.e., achieving a state where the target's
location (e.g., a cell in a search grid) can be stated with absolute certainty.
Unfortunately, this state is seldom reached with non-visual sensors because
of the possible occurrence of false responses. Thus, new approaches are
required to deal realistically with these situations.


In  this  report,  in  fact,  the  detection  state  is  not  observable.  That  is,  in  this 
report it is assumed that the decision maker can never state "We have detected
the target." He can only become increasingly confident that the information provided
by his sensors is consistent with a particular target location or motion hypothesis.


Even within the narrow confines of ASW search and surveillance, there is
a wide variety of tactical situations which might arise and which might involve
many different types of ASW units, sensors, and systems. In order to treat this
diversity, we have decided to emphasize concepts rather than details. Our
intent is to show the potential usefulness of certain ideas rather than to present
detailed algorithms for the implementation of these ideas in specific situations.


Chapter II discusses methods for centralized processing of diverse kinds of
sensor data and general intelligence. These methods have been applied without
false target considerations in Coast Guard search and rescue (SAR) cases (see
reference [b]) and in certain ASW situations. The discussion is based upon an
idealized tactical setting where a fixed distributed field of sensors provides the
response data. These responses and subjective a priori information about target
information are input to the processing system; the output of the processing system
provides the answer to the question, "Where is the target?" in the form of target
location probability maps. Illustrations are given for the cases of targets
patrolling on station, targets in transit, and targets out of the area of interest
entirely. In the latter case, all sensor information is false response information.


In Chapter II the subjective input takes the form of scenarios for target motion
together with associated credences. The "weighted scenario" idea was introduced
by Dr. John P. Craven during the Mediterranean H-bomb search in 1966 and used
to develop an a priori target location probability distribution for that operation.
The weighted scenario approach was used subsequently in the 1968 search for the
submarine Scorpion (see reference [g]) and is presently incorporated in the
operational computer-assisted search and rescue planning (CASP) system of the
Coast Guard.


The methods illustrated in Chapter II also permit the input of probability
distributions, rather than single-valued estimates, for parameters whose values
are uncertain; the "true" values of these parameters are estimated from the
sensor observation data concurrent with the determination of the target location
probability distributions. Appendix A supports the material in Chapter II with a
more  general  and  abstract  discussion  of  the  information  processing  concepts. 

Chapter III is concerned with optimal utilization of the information given by
the target location probability distributions, and the analysis in this chapter is
intended primarily to demonstrate the potential applications of information theory
to ASW surveillance in a false target environment. In this kind of environment,
sensor responses do not necessarily indicate target presence, but they do provide
a certain amount of information. Our results indicate that this information may
be quantified, analyzed mathematically, and usefully applied in terms of the
concepts of information theory.

The  tactical  setting  considered  in  Chapter  III  is  an  idealized  ASW  surveillance 
situation  in  which  one  is  interested  in  finding  the  sequential  assignment  of  ASW 
search  which  will  maximize  the  number  of  times  that  the  target’s  position  is 
correctly  specified  over  an  extended  period  of  time.  Four  surveillance  policies 
(i.e., sequential allocations of search effort) are compared using Monte Carlo
simulation.  The  policy  which  maximizes  the  expected  information  gain  in  the 
posterior  target  location  probability  distribution  is  found  to  provide  the  best  overall 
results  in  the  cases  examined. 

Previous studies (in particular, references [h], [i], and [j]) of the
connections  between  search  theory  and  information  theory  have  reached  negative 
conclusions.  These  previous  studies  are  reviewed  in  the  final  section  of 
Chapter  III  and  some  reasons  for  the  apparent  disagreement  are  offered. 

Information theoretic approaches have been used extensively in statistics (see,
for example, reference [k]); Appendix B relates these statistical methods to the
surveillance  problem  from  the  point  of  view  of  sequential  experimental  design 
and  hypothesis  testing. 


CHAPTER  II 


ASW  INFORMATION  PROCESSING  IN  A FALSE  TARGET  ENVIRONMENT 


This chapter presents procedures for processing ASW information in a false
target environment for the purpose of predicting target location. The procedures
are computer-oriented and suited for use in command centers which have access
to diverse kinds of ASW sensor data and intelligence. Questions pertaining to the
utilization of the target location predictions are deferred to Chapter III.

The methods indicated in this chapter are Bayesian and are addressed primarily
to  answering  the  question,  "Where  is  the  target?".  The  results  are  displayed  in 
terms  of  target  location  probability  maps  which  are  based  upon  subjective  target- 
mission  scenarios  and  upon  observed  sensor  response  data.  The  maps  express 
the  target  location  probability  distributions  in  terms  of  grid-cell  probabilities. 

Other  useful  results  such  as  the  probability  distributions  for  target  course  and 
speed  could  be  displayed  if  desired  but  are  not  treated  in  this  report. 

Each  ASW  situation  has  its  own  peculiarities,  and  discussion  of  the  information 
processing methods in a way which would cover all contingencies would, it is
believed,  obscure  the  basic  principles.  Therefore,  our  main  purpose  is  to 
demonstrate  the  potential  usefulness  of  the  concepts  in  terms  of  specific  examples 
and  to  provide  a mathematical  framework  for  further  applications. 

Successful  implementation  of  the  methods  will  depend  to  a large  measure  upon 
one's  ability  to  exploit  the  specifics  of  each  application  (target  mission  objectives 
and patterns of operation, own systems characteristics, crew proficiency, etc.)
and  to  construct  a mathematical  model  having  a level  of  detail  and  realism 
consistent  with  both  the  data  quality  and  the  constraints  imposed  by  computer  memory 
size  and  computation  speed. 

As  mentioned  above,  the  information  processing  methods  discussed  in  this 
chapter  are  Bayesian.  Briefly  described,  one  begins  by  generating  a large 
collection of "constructs," $c_1, \ldots, c_N$. Each construct specifies a complete target
track as well as any parameters of the mathematical model which are not assumed
to be known exactly. For each construct $c_n$, there is specified a prior probability
$p_n$ that the nth construct is correct. The prior probabilities reflect the validity
of the constructs before any information is obtained from the various ASW sensors.
Usually, $p_n = 1/N$ when the constructs are generated by monte-carlo simulation.

Sensor information is used to update the prior probabilities for the constructs
in the form of a posterior distribution. This is done as follows. Let

$$q_n = \Pr\{\text{sensor response patterns observed throughout the time period of interest} \mid \text{the nth construct is correct}\}.$$

If $p'_n$ denotes the posterior probability for the nth construct, then according to
Bayes's formula

$$p'_n = \frac{q_n p_n}{\sum_{m=1}^{N} q_m p_m} \quad \text{for } n = 1, \ldots, N.$$
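
The update given by Bayes's formula can be sketched in a few lines of Python. This is only a minimal illustration, not the program used for the report: the function name `bayes_update` and the numerical likelihoods are hypothetical, chosen so that the arithmetic is easy to follow.

```python
def bayes_update(priors, likelihoods):
    """Posterior construct probabilities: q_n * p_n / sum over m of q_m * p_m."""
    joint = [q * p for q, p in zip(likelihoods, priors)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("every construct has zero likelihood")
    return [j / total for j in joint]


# Four equally likely constructs (p_n = 1/N); the q_n values are made up.
priors = [0.25, 0.25, 0.25, 0.25]
likelihoods = [0.10, 0.40, 0.40, 0.10]
posteriors = bayes_update(priors, likelihoods)
print(posteriors)
```

Constructs whose tracks and parameters explain the observed responses poorly (small likelihood) lose weight, while the posterior probabilities continue to sum to one.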


The  first  section  illustrates  the  information  processing  methods  by  applying 
them  to  hypothetical  ASW  situations.  The  second  section  presents  the  details  of 
the  mathematical  procedures  used  to  compute  the  illustrations.  Appendix  A 
provides  a generalisation  of  the  information  processing  methods  to  more  general 
situations. 


Illustrative  ASW  Applications 

This section illustrates the results of applying certain methods for processing
subjective target motion scenario information and ASW sensor response data in
order  to  obtain  estimates  of  the  target  location  probability  distribution  and  of  certain 
other  parameters  of  interest.  The  methods  themselves  are  postponed  to  the  second 
section.  The  target  location  probability  distribution  permits  one  to  determine  the 
probability  that  the  target  is  contained  within  specific  geographical  regions.  These 
probability  distributions  are  of  central  importance  in  ASW. 

In  this  report,  the  target  location  probability  distribution  is  expressed  in  terms 
of grid-cell probabilities as illustrated in Figure II-1. Charts such as Figure II-1
are often referred to as target location probability maps. In the case shown, there
is a 20% chance that the target is in cell C-3 and a 70% chance that the target is in
the region covered by cells B-3, C-2, C-3, C-4, and D-3. Target location
probabilities  associated  with  other  regions  may  be  obtained  by  summing  the  appropriate 
probabilities. 
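
The summation over cells may be sketched as follows. Only the 20% value for cell C-3 and the 70% regional total are taken from the text; the other four cell values are hypothetical numbers chosen to be consistent with that total, and the function name `region_probability` is an assumption made for illustration.

```python
# Grid-cell probabilities; all values except C-3 are hypothetical.
cell_probability = {
    "B-3": 0.10,
    "C-2": 0.15,
    "C-3": 0.20,
    "C-4": 0.15,
    "D-3": 0.10,
}


def region_probability(cell_probability, cells):
    """Probability that the target lies in a region made up of whole cells."""
    return sum(cell_probability.get(cell, 0.0) for cell in cells)


region = ["B-3", "C-2", "C-3", "C-4", "D-3"]
print(round(region_probability(cell_probability, region), 2))
```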

Although  tactical  use  of  the  target  location  probability  distribution  is  not 
discussed  in  this  chapter,  a comment  on  contact  investigation  is  in  order.  The 
usual objective of contact investigation is to detect and further localize the target.
To a large extent the target location probability distribution consolidates all of the
relevant information needed to pursue this objective. The probability map
displays the information pertaining to sensor contacts combined with the equally
important information pertaining to target-mission objectives and patterns of
target operation. In many cases, therefore, it is better to investigate areas
associated with updated target location probabilities rather than to investigate
points associated with the individual contacts. The usefulness of the latter
investigation usually decreases rapidly as "time late" increases.

The  first  subsection  below  introduces  the  sensor  response  assumptions. 

The  three  subsections  which  follow  the  first  provide  the  illustrative  numerical 
examples.  These  are  separately  addressed  to  the  cases  where  the  target  is 
(1)  patrolling  station,  (2)  in  transit,  and  (3)  out  of  area. 

Sensor assumptions. Figure II-2 shows the sensor field which will be used
in all of the examples given in this section. The sensors are arranged in a
fixed rectangular array with 60-mile spacing between rows and columns.

The term "sensor response" will be used to indicate that a decision has been
made that the sensor output contains a sufficient number of target-related cues
so that the hypothesis that the target is present is preferred to the alternative
hypothesis that the target is not present. A decision-theoretic discussion of this
determination is given in detail in Chapter IV of reference [a], and we will not
be concerned further with these details.

Sensor  response  decisions  might  be  made  by  an  individual  in  charge  of  a 
sensor team or, perhaps, by the programming logic of an automatic classification
device.  The  information  processing  methodology  presented  in  this  chapter  may  be 
particularly  useful  in  the  latter  case  because  the  programming  of  an  automatic 
classification  device  requires  the  explicit  statement  of  classification  decision 
rules.  Such  explicit  rules  are  much  easier  to  deal  with  analytically  than  are  the 
less  explicit  rules  underlying  human  decision  making. 

A "detection" is defined to be a sensor response caused by the target and a
"false response" is defined to be a sensor response caused by something other than
the  target. 

The  distributed  sensors  are  monitored  at  the  end  of  24-hour  intervals.  Each 
monitoring event is treated as a "single glimpse." Continuous field observations
could  also  be  modeled  but  would  require  more  complex  algorithms  than  those 
developed  to  compute  the  examples  in  this  section. 

It  is  assumed  that  each  sensor  has  a maximum  detection  range  of  60  miles 
and that the single-sensor, single-glimpse detection probability is $P_D = .8$ if the
target comes within this range of a sensor. Probability of detection is assumed
zero outside of 60 miles. The single-sensor, single-glimpse probability of false
response is $P_A = .3$ regardless of target location. Thus, if the target is within
60 miles of a sensor, then the probability of response is $1 - (1 - P_D)(1 - P_A) = .86$,
and if the target is not within 60 miles of a sensor, then the probability of
response is $P_A = .3$.

It is assumed for illustration that all sensor responses are statistically
independent in space and time. More complex assumptions could be made if
desired in a real application. The values $P_D = .8$ and $P_A = .3$ are assumed unknown
to the information processor. What is known, however, is that detection probability
and false-response probability are related by an ROC (receiver operating
characteristic) relationship

$$P_A = f(P_D)$$

where, for some fixed $\alpha > 0$,

$$f(p) = p^{\alpha} \quad \text{for } 0 \le p \le 1.$$

Note that $P_A$ increases with $P_D$ and that $P_A = 0$ when $P_D = 0$ and $P_A = 1$ when
$P_D = 1$. The true values of $P_D$ and $P_A$ will be estimated from the operationally
derived data as part of the processing.
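
A minimal sketch of the power-law ROC relationship follows. The report does not state the exponent used; the value computed here is an inference, obtained by assuming that the true pair (detection probability .8, false-response probability .3) lies on the curve. The function names are hypothetical.

```python
import math


def roc_false_response(p_detect, alpha):
    """False-response probability implied by the power-law ROC curve."""
    return p_detect ** alpha


def alpha_from_point(p_detect, p_false):
    """Exponent for which the ROC curve passes through a given point."""
    return math.log(p_false) / math.log(p_detect)


# Assumption: the true operating point (.8, .3) lies on the ROC curve.
alpha = alpha_from_point(0.8, 0.3)
print(f"alpha = {alpha:.3f}")
print(f"P_A at P_D = .8: {roc_false_response(0.8, alpha):.3f}")
```

Under this assumption the exponent is roughly 5.4, so the false-response probability falls off quickly as the detection probability decreases.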

Any  function  relating  Pa  and  Pd  could  be  used  without  significantly  increasing 
the  complexity  of  the  processing  algorithms.  In  fact,  it  would  not  be  difficult  to 
devise an algorithm which would permit postulation of an entire family of possible
ROC relationships when there is uncertainty as to which relationship is correct.
The correct relationship could then be inferred from the operationally derived data.

The  above  assumptions  are  made  in  order  to  illustrate  the  information 
processing  ideas  within  the  framework  of  a simple  and  easily  understood  mathematical 
model.  They  are  not  necessarily  recommended  for  real-world  applications. 
Alternative models for detection are provided, for example, in references [l],
[m], [n], [o], and [p]; reference [a] provides an hierarchy of decision-theoretic
models which treat detection and classification in a unified manner.

False responses result from complex interactions involving, among other
things, the sensor system, the environment, and human factors (see reference
[a]). At the present time, these interactions are not well understood and many
of the factors which are involved (e.g., command attitudes and individual motivation)
are not physically observable or measurable. Any estimate of false-response
probability, therefore, could be in error by a significant amount. For this reason,
it is important to develop information processing procedures which do not require
exact knowledge of false-response probabilities and which are adaptive in the sense
that initial estimates of these probabilities can be modified by observed sensor
responses.

In  order  to  reflect  initial  uncertainty  about  target  characteristics,  sensor 
capabilities,  and  environmental  conditions,  therefore,  we  shall  assume  a uniform 
probability distribution (known to the information processor) for $P_D$ on the
interval between .5 and .9. The expected value of this probability distribution is
.7 and $P_A$ is determined from $P_D$ by means of the ROC function. (See references
[q] and [r] for related analyses when the target is stationary and sensor
capabilities are not known precisely.)

Example  1 — target  patrolling  station.  This  example  applies  to  the  case  of 
a target patrolling station. It is assumed that scenarios can be postulated for
target  motion  based  upon  past  observations  of  similar  targets  or  knowledge  of 
the  present  target's  mission  objectives.  Associated  with  each  scenario  is  a credence 
which  expresses  the  scenario's  relative  plausibility.  In  the  present  example,  a 
scenario  specifies  a probability  distribution  for  the  target's  location  at  equally 
spaced  points  in  time.  The  target  is  assumed  to  move  along  legs  with  constant 
course  and  speed  between  leg  endpoints.  Monte-carlo  procedures  are  used  to 
obtain  a large  number  of  sample  target  tracks  for  each  scenario  specified  (more 
details are given in the second section). The number of tracks generated for each
scenario is proportional to the associated credence. A particular target track is
generated by randomly drawing the endpoints of each track leg from the specified
endpoint probability distributions.
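
The track-generation step can be sketched as follows. This is an illustration under stated assumptions, not the report's program: the endpoint means are hypothetical coordinates in miles, the function names are invented, endpoints are drawn independently from circular normal distributions, and positions along a leg are found by linear interpolation (constant course and speed).

```python
import random


def generate_tracks(scenarios, n_tracks, seed=0):
    """Draw sample tracks, with counts proportional to scenario credence."""
    rng = random.Random(seed)
    tracks = []
    for index, scenario in enumerate(scenarios):
        count = round(n_tracks * scenario["credence"])
        for _ in range(count):
            endpoints = [
                (rng.gauss(mx, scenario["sigma"]), rng.gauss(my, scenario["sigma"]))
                for mx, my in scenario["means"]
            ]
            tracks.append((index, endpoints))
    return tracks


def position_at(endpoints, hours_per_leg, t):
    """Target position at time t, moving at constant velocity along each leg."""
    leg = min(int(t // hours_per_leg), len(endpoints) - 2)
    frac = (t - leg * hours_per_leg) / hours_per_leg
    (x0, y0), (x1, y1) = endpoints[leg], endpoints[leg + 1]
    return (x0 + frac * (x1 - x0), y0 + frac * (y1 - y0))


# Hypothetical endpoint means (miles); only credences and sigmas follow the text.
scenarios = [
    {"credence": 0.33, "sigma": 30.0, "means": [(180, 60), (60, 60), (60, 300)]},
    {"credence": 0.33, "sigma": 30.0, "means": [(180, 300), (300, 300), (300, 60)]},
    {"credence": 0.34, "sigma": 60.0, "means": [(180, 180), (180, 180), (180, 180)]},
]
tracks = generate_tracks(scenarios, n_tracks=500)
print(len(tracks), position_at(tracks[0][1], hours_per_leg=24, t=36))
```

Each generated track would then serve as one construct, with equal prior weight, in the Bayesian update.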

Figure II-3 presents the scenarios chosen for this example. Scenario 1 with
credence .33 describes a target patrolling in a clockwise direction beginning in
the south and moving west, then north, and then east. All track-leg endpoint
probability distributions are assumed to be circular normal with 30-mile standard
deviations. Scenario 2 with credence .33 describes a target which also is
patrolling in a clockwise direction, but beginning in the north and moving east
and then south and west. The endpoint probability distributions are also normal
with 30-mile standard deviations. Scenario 3 with credence .34 describes a
target  which  is  patrolling  in  the  center  of  the  area  without  a regular  pattern  of 
motion.  For  scenario  3,  all  track-leg  endpoint  probability  distributions  have 
identical  normal  distributions  with  60-mile  standard  deviations. 

Figure II-4 shows the time history of responses from the distributed field
simulated by a single replication of monte carlo. The target is assumed to follow
Scenario 2 and its position as a function of time is also shown in Figure II-4.
The positions were chosen to coincide with the means of the Scenario 2
distributions. A 60-mile radius circle indicating sensor detection range is drawn about
the target's position so that the responses outside this circle (necessarily false)
may easily be identified.

Note: Circles indicate the 2σ-uncertainty in the circular normal
distribution for target position at the endpoints of each leg.

(a) Scenario 1 (Credence = .33)

(b) Scenario 2 (Credence = .33)


FIGURE II-4

TIME HISTORY OF SENSOR RESPONSES
EXAMPLE 1 (TARGET PATROLLING STATION)

Notes: (1) Target follows Scenario 2 of Example 1.

(2) Detection probability is .8 and false-response probability is .3.

(3) x indicates target position (note 60 mi detection circle),
• indicates a sensor response, and
O indicates no sensor response.

(a) 24 hours along track (b) 48 hours along track

Figures II-5 and II-6 show the target location probability distributions before
and after processing the sensor response patterns shown in Figure II-4 (500
monte-carlo replications* were used to produce these distributions). Figure II-5
shows the target location probability distributions based only upon the weighted
scenarios; no sensor response information is incorporated. Figure II-6 shows
the distributions which result from incorporating sensor response information.
Figure II-5(a) shows the target location probability distribution for the target
24 hours along its track based solely upon the scenario formulations with no
incorporation of the sensor response patterns. Note that the target actually lies
on the boundary between cells B-3 and B-4 and that the sum of the probabilities
in these two cells is 27%.

Figure II-6(a) shows the target location probability distribution for the same
time as Figure II-5(a) but updated by incorporation of the sensor response patterns
shown in Figure II-4(a). Note that the sum of probabilities in the cells B-3 and B-4,
which include the target, remains about the same (28%) but that the probability
distribution has become more concentrated. It should be noted from Figure II-4(a)
that there were 9 false responses from the 23 sensors beyond detection range of the
target. This is somewhat higher than the 6 or 7 false responses expected based
upon the assumed value of $P_A = .3$ for the false-response probability.
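
The expected count quoted above, and the chance of seeing 9 or more false responses, can be checked with a short binomial calculation. This is a sketch that relies on the stated independence assumption; the function name `binomial_tail` is hypothetical.

```python
import math


def binomial_tail(n, p, k_min):
    """Probability of k_min or more successes in n independent trials."""
    return sum(
        math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )


n_out_of_range = 23   # sensors beyond detection range of the target
p_false = 0.3         # single-glimpse false-response probability
expected_false = n_out_of_range * p_false
tail = binomial_tail(n_out_of_range, p_false, 9)
print(f"expected false responses: {expected_false:.1f}")
print(f"P(9 or more false responses): {tail:.3f}")
```

The expected count of 6.9 agrees with the "6 or 7" quoted in the text.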

Figure II-5(b) shows the target location probability distribution for the target
at  48  hours  along  its  track  based  solely  upon  the  scenario  formulations  with  no 
incorporation  of  any  sensor  response  patterns.  The  sum  of  probabilities  in  the 
cells  B-4  and  B-5  containing  the  target  is  22%. 

Figure II-6(b) shows the updated target location probability distribution at
48 hours along the track, incorporating the sensor response patterns shown in
Figure II-4(a) and in Figure II-4(b). The sensor response pattern given by
Figure II-4(b) is the result of very "bad luck." Many false responses were obtained
in  the  areas  occupied  by  targets  following  Scenarios  1 and  3 while  at  the  same  time 
few  responses  were  obtained  in  the  area  occupied  by  targets  following  Scenario  2. 
The  actual  target  (following  Scenario  2)  was  detected  only  once  out  of  two 
opportunities. 

As  a result,  the  sum  of  probabilities  in  the  cells  B-4  and  B-5  containing  the 
target  decreases  to  3%.  Action  based  on  the  results  at  this  stage  would  not  have 
much  chance  of  success. 

Figure II-5(c) shows the target location probability distribution for the target
at 72 hours along its track based solely upon the scenario formulations. The
sum of probabilities in cells C-4 and C-5 containing the target is 33%.


* For operational real-time applications, a much larger number of replications
is suggested. In past utilization of similar systems, 2,000 to 10,000
replications have been employed routinely.

Figure II-6(c) shows the corresponding updated target location probability
distribution incorporating all the sensor response patterns up to and including
those shown in Figure II-4(c). The sum of probabilities in cells C-4 and C-5
is now seen to increase dramatically to 85%. Apparently, a sufficient number
of patterns have been processed at this point for the updated target location
distribution to begin to converge on the target's actual location.

Finally, Figures II-5(d) and II-6(d) provide the before and after comparisons
corresponding to the target at 96 hours along its track. Without use of sensor
response patterns, the sum of the probabilities in cells D-4 and D-5 containing
the target is 26% as given in Figure II-5(d). After use of all sensor response
patterns shown in Figure II-4, the sum of the probabilities in cells D-4 and D-5
is 97%.

Table II-1 shows the influence of the sensor responses on the scenario
credences and the mean detection probability. Recall that initially the scenario
credences were assumed equal; the sensor detection probability was assumed
uniformly distributed between .5 and .9 with mean value of .7. Processing of
the sensor response patterns at 24 hours and 48 hours decreases the credence
associated with Scenario 2 from .330 to .211 and .044, respectively. The
results improve as the response patterns for 72 hours and 96 hours are incorporated.
The updated weight for Scenario 2, the actual scenario, rises to .952 following
incorporation of the sensor responses obtained at 96 hours.

The mean detection probability $P_D$ before processing is .7. In the present
example, the actual but unknown detection probability is $P_D = .8$. After
processing all the sensor responses, the updated mean detection probability is .77.

Thus,  in  spite  of  an  unknown  and  relatively  high  false-response  probability, 
the processing algorithms produce (in this example) a very accurate indication
of the true target scenario and single-sensor detection probability. Moreover,
after  processing  the  sensor  response  patterns,  the  target  location  probability 
distribution  becomes  quite  concentrated  about  the  true  location  of  the  target. 

Example  2 — target  in  transit.  This  example  considers  the  problem  of 
localizing a target as it transits through an area covered by the distributed sensor
field. The target (if it shows up) is expected to begin its transit through the area
between time 0 and time 72 hours, but the exact time of transit is unknown. It is
desired to use the sensor response patterns to detect the target's presence in the
area  and  to  localize  it  as  it  moves  through. 

Figure II-7 presents the scenarios formulated for this example. Once again,
the location of the target at the endpoint of each leg is specified by a normal
probability distribution. The distributions are elongated in the east-west direction,
however, in order to represent the uncertainty in the target's location across a
"front." The standard deviation in the east-west direction is 60 miles and the
standard deviation in the north-south direction is 30 miles.

FIGURE II-5

TARGET LOCATION PROBABILITY DISTRIBUTIONS
EXAMPLE 1 (TARGET PATROLLING STATION) - NO SENSOR INFORMATION USED

Notes: (1) Target follows Scenario 2 of Example 1.

(2) Detection probability is .8 and false-response probability is .3.

(3) x indicates target position (note 60 mi detection circle).

(a) 24 hours (b) 48 hours

FIGURE II-6

TARGET LOCATION PROBABILITY DISTRIBUTIONS
EXAMPLE 1 (TARGET PATROLLING STATION) - ALL SENSOR INFORMATION USED

Notes: (1) Target follows Scenario 2 of Example 1.

(2) Detection probability is .8 and false-response probability is .3.

(3) x indicates target position (note 60 mi detection circle).

TABLE II-1

THE INFLUENCE OF SENSOR RESPONSES ON ESTIMATED PARAMETER VALUES
EXAMPLE 1 (TARGET PATROLLING STATION)

Notes: (1) Target follows Scenario 2 of Example 1.

(2) True single-sensor, single-glimpse detection probability is .8 and
false-response probability is .3.

                                                  Mean of Single-Sensor,
                          Scenario Credences      Single-Glimpse Detection
Time of Field Response    Scenario:               Probability Distribution
                          1       2       3       (true value is .8)

Initial Assumptions       .330    .330    .340    .700
24 hrs                    .299    .211    .490    .823
48 hrs                    .525    .044    .431    .811
72 hrs                    .030    .712    .252    calculation not available
96 hrs                    .001    .952    .047    .770

FIGURE II-7

SCENARIOS FOR TARGET MOTION
EXAMPLES 2 AND 3 (TARGET IN TRANSIT)

Notes: (1) Ellipses indicate the 2σ-uncertainty in target position at
the endpoints of each leg.

(2) Scenario 5 (Credence = .2) corresponds to no target transit
during the time period of interest.


Scenarios  1 through  4 differ  only  in  the  assumed  time  of  entry  into  the  area, 
i.e.,  Scenarios  1 through  4 are  based,  respectively,  on  the  target  reaching  the 
midpoint  of  the  first  row  of  cells  at  0 hours,  24  hours,  48  hours,  and  72  hours. 
Scenario  5 (not  shown)  is  included  to  cover  the  contingency  that  there  is  no  target 
transit  during  the  time  of  interest.  The  initial  credences  for  all  five  scenarios 
are  equal. 
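
With equal initial credences, the prior probability that a transit has begun by a given time is simply the sum of the credences of the scenarios whose entry time has passed. The sketch below illustrates this; the variable and function names are hypothetical, and `None` is used here to mark Scenario 5 (no transit).

```python
# Entry times in hours for Scenarios 1-4; None marks Scenario 5 (no transit).
entry_hours = [0, 24, 48, 72, None]
credences = [0.2, 0.2, 0.2, 0.2, 0.2]


def prob_transit_begun(t, entry_hours, credences):
    """Total credence of scenarios whose transit has begun by time t."""
    return sum(
        c for entry, c in zip(entry_hours, credences)
        if entry is not None and entry <= t
    )


for t in (0, 24, 48, 72, 96):
    print(t, round(prob_transit_begun(t, entry_hours, credences), 2))
```

The resulting values, .20, .40, .60, .80, and .80, are the prior probabilities against which the updated probabilities are later contrasted.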

In  certain  cases,  the  processing  algorithms  will  produce  good  results  even 
when  the  actual  target  motion  does  not  conform  to  any  of  the  scenarios  specified. 

In  order  to  demonstrate  this  fact,  the  target  is  assumed  actually  to  begin 
penetration  at  12  hours.  This  is  midway  between  the  assumptions  of  Scenario  1 
and Scenario 2.

Figure II-8 shows the target's actual position and the sensor response
patterns  at  24-hour  intervals.  The  sensor  response  assumptions  are  the  same 
as  those  used  in  Example  1. 

Figures II-9 and II-10 are based on 500 monte-carlo replications.

Figure II-9 shows the target location probability distributions when no sensor
response patterns are processed by the system. Probabilities associated with
locations outside of the grid are not shown. These probability distributions are
based solely upon the initial scenario formulations. Note that the actual target's
position lies within a cell with probability .03 throughout the transit. In order
to demonstrate the flexibility of the algorithms, the computation of this example
was based upon the assumption that the endpoint probability distributions are
correlated so that the simulated target tracks through the area will be straight
lines (no zig zags). This accounts for the fact that the target location probability
distributions in Figure II-9 have the appearance of a single distribution sliding
through the area.

Figure II-10 shows the target location probability distributions resulting from
processing all sensor response patterns. Note that the target is located in cells
having relatively large probabilities and that the probability distributions are
much more concentrated than was the case in Figure II-9. It is also of interest
to contrast the probability that a transit has begun (given in the notes corresponding
to each time period) with the initial probabilities based on the scenarios only
(given by the first general note). Once the target actually penetrates the area,
these probabilities are substantially higher than the corresponding probabilities
based upon the initial scenario assumptions alone. For example, at 24 hours,
the probability is .40 that a transit has begun based upon the scenarios only.
The corresponding probability making use of the sensor responses is .92.

It is also interesting to contrast Figure II-10(e) with Figure II-9(e). The
target has completed transit of the area at this time; this is quite apparent in
Figure II-10(e) which shows only 2% probability of the target being in the area.

Table II-2 shows the influence of the sensor responses on the scenario
credences and the mean detection probability. Recalling that the target enters
the grid area midway between the times specified by Scenarios 1 and 2, we see
that at times corresponding to 24 hours and thereafter, Scenarios 1 and 2
together account for more than 90% of the total scenario probability. The sum
of initial credences for these scenarios is .40.


As  in  Example  1,  the  mean  of  the  detection  probability  distribution  appears 
to be converging towards the true value .8.


Example  3 — target  out  of  grid  area.  This  example  is  based  upon  the  same 
scenarios  as  Example  2 but  corresponds  to  the  case  where  no  target  penetrates 
the  area,  i.e.,  where  Scenario  5 is  the  correct  scenario.  As  in  Examples  1 
and  2,  500  monte-carlo  replications  are  used. 


Figure II-11 shows the patterns of sensor responses which, under the
present  assumptions,  are  all  false.  No  target  location  probability  distributions 
were  computed  for  this  example. 


Table II-3 shows the influence of the sensor responses on the scenario
credences and the mean of the detection probability distribution. The shaded
area in the table corresponds to scenarios specifying that the target has not yet
entered the area. Note that as more sensor response patterns are processed,
the probability tends to shift towards the "shaded" region and that at the end of
96 hours the largest scenario credence is associated with Scenario 5, the
correct scenario.


As  in  Examples  1 and  2,  the  mean  of  the  detection  probability  distribution 
appears  to  be  converging  towards  the  correct  value  of  . 8.  Use  of  the  ROC  curve 
permits detection probability to be estimated from false-response data when the
target  is  not  in  the  area. 
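
The way the ROC curve lets false-response data inform the detection probability can be sketched with a simple grid approximation. This is an illustration under several assumptions, not the report's algorithm: the ROC exponent is inferred from the operating point (.8, .3), the counts of 120 sensor looks and 36 responses are hypothetical, and the function name is invented. The prior is the uniform distribution on the interval between .5 and .9 described earlier.

```python
import math

# Assumption: ROC exponent chosen so that P_D = .8 corresponds to P_A = .3.
ALPHA = math.log(0.3) / math.log(0.8)


def posterior_mean_detection(n_looks, n_responses, grid_size=401, lo=0.5, hi=0.9):
    """Posterior mean of P_D when every response is known to be false.

    Uses a uniform prior on [lo, hi] and a binomial likelihood in which the
    false-response probability is tied to P_D through the ROC curve.
    """
    grid = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]
    weights = []
    for p_detect in grid:
        p_false = p_detect ** ALPHA
        log_like = (n_responses * math.log(p_false)
                    + (n_looks - n_responses) * math.log(1.0 - p_false))
        weights.append(math.exp(log_like))
    total = sum(weights)
    return sum(p * w for p, w in zip(grid, weights)) / total


# Hypothetical data: 36 false responses in 120 sensor looks (a 30% rate).
estimate = posterior_mean_detection(n_looks=120, n_responses=36)
print(f"posterior mean of P_D: {estimate:.3f}")
```

Because the observed false-response rate fixes a point on the ROC curve, the posterior for the detection probability concentrates near the corresponding value even though no detections occur.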


Information Processing Procedures


This section describes the information processing procedures used to obtain
the results given in the examples in the preceding section. A more general
treatment is given in Appendix A. Our purpose here is to explain the concepts
in  terms  of  the  simple  model  used  in  the  preceding  section  so  that  the  reader  may 
construct suitable models for other applications.

Notes: (1) Target is a 12-hour late Scenario 1 or a 12-hour early Scenario 2 of Example 2.

(2) Detection probability is .8 and false-response probability is .3.

(3) x indicates target position (note 60 mi detection circle),
• indicates a sensor response, and
O indicates no sensor response.

(a) 0 hours (Note: Target outside grid.) (b) 24 hours

Notes: (1) Assumed prior probabilities that target transit has begun:

Time (hrs):   0     24    48    72    96
Probability:  .20   .40   .60   .80   .80

(2) Only probabilities within the grid are shown.

(3) x indicates target position.

(a) 0 hours (Note: Target is north of grid.)

(b) 24 hours

(e) 96 hours (Note: Target is south of grid.)

(c) 48 hours (Note: Estimated probability that transit has begun is .94.)

(d) 72 hours (Note: Estimated probability that transit has begun is .97.)

(e) 96 hours (Notes: (1) Target is south of grid. (2) Estimated probability that
transit has begun is .95.)

Notes: (1) Target is out of grid area covered by sensors.

(2) All responses are false. The false-response probability is .3.

(3) • indicates a sensor response, and
O indicates no sensor response.

THE INFLUENCE OF SENSOR RESPONSES ON ESTIMATED PARAMETER VALUES
EXAMPLE 3 (TARGET OUT OF GRID AREA)

Notes: (1) Target is out of grid area (Scenario 5 of Example 3).
All sensor responses are false responses.

(2) True single-sensor, single-glimpse detection
probability is .8 and false-response probability is .3.

(3) Shading indicates scenarios placing target out of
area at the specified times.

                                                        Mean of Single-Sensor,
                       Scenario Credences               Single-Glimpse Detection
                       Scenario:                        Probability Distribution
                       1     2     3     4     5        (true value is .8)

Initial Assumptions    .20   .20   .20   .20   .20

The processing system consists of two information input files, SCENE and
DETECT, two state information files, UFILE and WGHT, and four computer
programs, START, MAP, TRANS, and OBSERV, which operate on the state
information files. These files and computer programs are discussed in the
following subsections. A processing system flow chart is provided in Figure
II-12. The file UFILE contains the "constructs" mentioned in the heuristic
description  given  at  the  beginning  of  the  chapter.  The  file  WGHT  contains 
numbers  ("weights")  which  are  proportional  to  the  posterior  probabilities  of 
the  constructs. 


All scenario information is stored in the information input file SCENE. This
information consists of the value of the credence $c_i$ for the $i$th scenario
for all $1 \le i \le I$, the time $\delta$ specified for the target to complete
each of the $K$ single legs, and the parameters for the track-leg endpoint
probability distributions $A_i(k)$ for $0 \le k \le K$ and $1 \le i \le I$.
Here, $k\delta$ indicates the endpoint time and $i$ indicates the scenario.


The information input file DETECT provides the known parameters
characterizing the detection mechanism (i.e., the bounds on $P_D$ and the
parameters of the ROC function).


Files UFILE and WGHT. The file UFILE contains $N_r$ records which
provide the samples from the monte-carlo simulation of target position and
other parameters. The contents of UFILE vary with time and, therefore, it is
convenient to let UFILE(t) denote the contents of UFILE at simulation time t.
Each of the $N_r$ records in UFILE(t) contains statistical sample values for the
following random variables:

    $z_1(t)$ = target's latitude (degrees) at time t

    $z_2(t)$ = target's longitude (degrees) at time t

    $s_1(t)$ = target's velocity component (degrees/hour)
               in the north-south direction

    $s_2(t)$ = target's velocity component (degrees/hour)
               in the east-west direction

    $i$      = target's scenario index

    $P_D$    = target's probability of being detected by a
               single sensor on a single glimpse given that
               target is within detection range of the sensor.


WGHT(t) denotes the contents of file WGHT at simulation time t and contains
$N_r$ records, each providing the "weight" for the corresponding record of UFILE(t).
The weights are calculated using Bayes's formula and indicate the extent to which
the records of UFILE(t) are consistent with the observed sensor field responses.
Large weights correspond to a high degree of consistency. The method of
computing the weights is discussed below.

[Figure II-12. Processing system flow chart: (a) initial processing step.]


Program START. The computer program START creates UFILE(0) and
WGHT(0). To do this, it uses the data in the scenario input file SCENE and the
data in the detection characteristics input file DETECT.

The records of UFILE(0) are created one after the other. To create a
given record (say, the $n$th record), a sample scenario index $i^n$ is drawn in
accordance with the prescribed credences.

For the $n$th record, the values $z_1^n(0)$ and $z_2^n(0)$ of the random variables
$z_1(0)$ and $z_2(0)$ are found by sampling from the probability distribution
$A_{i^n}(0)$.

The speed components $s_1^n(0)$ and $s_2^n(0)$ corresponding to the random variables
$s_1(0)$ and $s_2(0)$ are found by sampling from $A_{i^n}(1)$ to obtain
$z_1^n(\delta)$ and $z_2^n(\delta)$ and then computing

\[ s_1^n(0) = \frac{1}{\delta}\left(z_1^n(\delta) - z_1^n(0)\right) \]

and

\[ s_2^n(0) = \frac{1}{\delta}\left(z_2^n(\delta) - z_2^n(0)\right). \]

The $n$th record of UFILE(0) is completed by determining the sample value
$p_D^n$ for the random variable $P_D$. This is done by taking the minimum value A
and the maximum value B for $P_D$ from the input file DETECT and computing

\[ p_D^n = A + \xi (B - A) \]

where, here and in what follows, $\xi$ denotes an independent draw from a uniform
distribution on the interval [0, 1].

The file WGHT(0) is generated by program START so that all weights are
equal to unity, i.e., if $w^n(0)$ denotes the content of the $n$th record of
WGHT(0), then

\[ w^n(0) = 1 \quad \text{for } 1 \le n \le N_r. \tag{II-1} \]

Equation (II-1) reflects the fact that all records of UFILE(0) are considered to be
equally likely a priori. The weights will change, however, whenever information
is obtained by observing the sensors.
Program TRANS. The computer program TRANS updates file UFILE to
reflect target motion in accordance with the scenario information provided in
the scenario input file SCENE. No change in file WGHT is made by TRANS
since no new sensor information is input to the system during this operation.

In order for TRANS to update UFILE, it is assumed that the tracks between
leg endpoints are straight when expressed in coordinates of latitude and
longitude. That is, if $t_1$ and $t_2$ are times corresponding to target positions on
the same leg, then for $l = 1$ and 2 and for $t_1 \le t \le t_2$,

\[ z_l(t) = \frac{t_2 - t}{t_2 - t_1}\, z_l(t_1) + \frac{t - t_1}{t_2 - t_1}\, z_l(t_2). \]

Suppose that TRANS is to update UFILE from simulation time $t_1$ to simulation
time $t_2$. For the $n$th record of UFILE($t_1$), let $\mu$ and $\nu$ be chosen so that
$\mu\delta \le t_1 < (\mu+1)\delta$ and $\nu\delta \le t_2 < (\nu+1)\delta$.

If $\nu = \mu$, then the target has not moved to another leg; consequently, for
$l = 1$ and 2,

\[ z_l^n(t_2) = z_l^n(t_1) + (t_2 - t_1)\, s_l^n(t_1) \]

and

\[ s_l^n(t_2) = s_l^n(t_1). \]


If $\nu = \mu + 1$, then the target has moved to the next leg and one must sample
from the probability distribution $A_{i^n}(\nu+1)$ in order to obtain the target's
position $z_l^n(\nu\delta + \delta)$ at time $(\nu+1)\delta$. Since, for $\nu = \mu + 1$
and $l = 1$ and 2,

\[ z_l^n(\nu\delta) = z_l^n(t_1) + (\nu\delta - t_1)\, s_l^n(t_1), \]

the target's position at time $\nu\delta$ as well as $(\nu+1)\delta$ is known, and,
therefore, the velocity components on the leg between times $\nu\delta$ and
$(\nu+1)\delta$ can be computed. Thus, for $l = 1$ and 2,

\[ s_l^n(t_2) = \frac{z_l^n(\nu\delta + \delta) - z_l^n(\nu\delta)}{\delta} \tag{II-2} \]

and target position at time $t_2$ is given by

\[ z_l^n(t_2) = z_l^n(\nu\delta) + (t_2 - \nu\delta)\, s_l^n(t_2). \tag{II-3} \]


Finally, if $\nu \ge \mu + 2$, then the target's position at time $t_2$ is statistically
independent of the target's position at time $t_1$. The velocity and position of the
target at time $t_2$ are found by sampling the probability distributions
$A_{i^n}(\nu)$ and $A_{i^n}(\nu+1)$ to determine $z_l^n(\nu\delta)$ and
$z_l^n(\nu\delta + \delta)$ for $l = 1$ and 2 and then by using equations (II-2) and
(II-3).
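The three cases handled by TRANS can be sketched in a few lines of Python. This
is only an illustration under stated assumptions, not the report's program: the
record layout (a dict with keys "z", "s", "i"), the name `advance_record`, and
the caller-supplied sampler `sample_endpoint(i, k)`, which stands in for a draw
from the endpoint distribution A_i(k), are all hypothetical.

```python
import math

def advance_record(rec, t1, t2, delta, sample_endpoint):
    """Advance one UFILE record from time t1 to time t2 (t2 >= t1).

    rec holds 'z' (latitude, longitude), 's' (velocity components) and
    'i' (scenario index). sample_endpoint(i, k) is an assumed helper that
    returns one (latitude, longitude) draw from A_i(k).
    """
    mu = math.floor(t1 / delta)  # leg containing t1
    nu = math.floor(t2 / delta)  # leg containing t2
    z, s, i = rec["z"], rec["s"], rec["i"]
    if nu == mu:
        # Same leg: continue along the current straight track.
        new_z = tuple(z[l] + (t2 - t1) * s[l] for l in range(2))
        return {"z": new_z, "s": s, "i": i}
    if nu == mu + 1:
        # Next leg: the position at time nu*delta follows from the old track.
        z_start = tuple(z[l] + (nu * delta - t1) * s[l] for l in range(2))
    else:
        # Two or more legs later: position is independent of the old one.
        z_start = sample_endpoint(i, nu)
    z_end = sample_endpoint(i, nu + 1)
    # Equation (II-2): velocity on the leg between nu*delta and (nu+1)*delta.
    new_s = tuple((z_end[l] - z_start[l]) / delta for l in range(2))
    # Equation (II-3): position at time t2 on that leg.
    new_z = tuple(z_start[l] + (t2 - nu * delta) * new_s[l] for l in range(2))
    return {"z": new_z, "s": new_s, "i": i}
```

Passing the sampler as an argument keeps the sketch independent of the form of
the endpoint distributions, which are specified in the file SCENE.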

Program OBSERV. The computer program OBSERV updates file WGHT to
reflect information gained from the sensor field. Suppose that WGHT($t_1$) is to
be updated to time $t_2$. Let $\tau_1, \ldots, \tau_\eta$ denote the times at which the
sensor field is observed between times $t_1$ and $t_2$ and assume that
$t_1 \le \tau_1 < \tau_2 < \cdots < \tau_\eta \le t_2$.

Let $t' = t_1$ and $\tau' = \tau_1$. The first step is to update UFILE($t'$) to time
$\tau'$. This is done using the program TRANS described above. Then the updated file
UFILE($\tau'$), file WGHT($t'$), and file DETECT are input to program OBSERV. The
weight $w^n(\tau')$ corresponding to the $n$th record of WGHT($\tau'$) is determined
from $w^n(t')$ corresponding to the $n$th record of WGHT($t'$) by multiplication by
the conditional probability of observing the actual field responses. That is,
$w^n(\tau')$ is computed by the formula (essentially Bayes's formula without
normalization)

\[ w^n(\tau') = w^n(t')\,[1 - p_F^n]^{\iota_1}\,[(1 - p_F^n)(1 - p_D^n)]^{\iota_2}\,[p_F^n]^{\iota_3}\,[1 - (1 - p_F^n)(1 - p_D^n)]^{\iota_4} \tag{II-4} \]

where the detection probability $p_D^n$ is taken from the $n$th record of UFILE($\tau'$)
and the false-response probability $p_F^n = f(p_D^n)$ is determined from $p_D^n$ by use of
the "ROC" function f. In our examples, f is defined for simplicity by $f(p) = p^a$;
the parameter a describing f is obtained from the input file DETECT.

The exponents $\iota_1, \ldots, \iota_4$ depend upon the position of the target
$(z_1^n(\tau'), z_2^n(\tau'))$ given in the $n$th record of UFILE($\tau'$). The exponents
are defined as follows:

    $\iota_1$ is the number of non-responding sensors which
    are beyond detection range of the target,

    $\iota_2$ is the number of non-responding sensors which
    are within detection range of the target,

    $\iota_3$ is the number of responding sensors which
    are beyond detection range of the target, and

    $\iota_4$ is the number of responding sensors which
    are within detection range of the target.
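The weight update of equation (II-4) amounts to multiplying the old weight by
one factor per sensor. The sketch below is a minimal illustration, assuming the
ROC function f(p) = p**a used in the examples; the function name and the two
boolean lists describing the sensor field are hypothetical, and deciding which
sensors are within detection range of a record's target position is left to
the caller.

```python
def update_weight(w, p_d, a, in_range, responded):
    """Return the unnormalized weight after one observation of the field.

    w         : weight of the record before the observation
    p_d       : the record's sample of the detection probability
    a         : ROC parameter, so that p_f = p_d ** a
    in_range  : in_range[k] is True if sensor k is within detection range
                of the target position stored in the record
    responded : responded[k] is True if sensor k gave a response
    """
    p_f = p_d ** a                                  # false-response probability
    p_resp_near = 1.0 - (1.0 - p_f) * (1.0 - p_d)   # response, target in range
    for near, hit in zip(in_range, responded):
        if near:
            w *= p_resp_near if hit else (1.0 - p_resp_near)
        else:
            w *= p_f if hit else (1.0 - p_f)
    return w
```

Looping over sensors gives the same product as raising the four factors to the
powers given by the four counts, without computing the counts explicitly.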


Once files UFILE($\tau_{\nu-1}$) and WGHT($\tau_{\nu-1}$) are completed for any
$2 \le \nu \le \eta$, files UFILE($\tau_\nu$) and WGHT($\tau_\nu$) are obtained by repeating
the procedure described above with $t' = \tau_{\nu-1}$ and $\tau' = \tau_\nu$.

When files UFILE($\tau_\eta$) and WGHT($\tau_\eta$) are obtained, the computation is
complete if $t_2 = \tau_\eta$. If $t_2 > \tau_\eta$, then the final update consists of using
TRANS to operate on file UFILE($\tau_\eta$) in order to generate UFILE($t_2$). Since no new
observations occur between times $\tau_\eta$ and $t_2$, WGHT($t_2$) is a replica of
WGHT($\tau_\eta$).

Program MAP and other output. Let the random vector U(t) be defined by

\[ U(t) = (z_1(t),\, z_2(t),\, s_1(t),\, s_2(t),\, i,\, P_D). \]

Each record $U^n(t) = (z_1^n(t),\, z_2^n(t),\, s_1^n(t),\, s_2^n(t),\, i^n,\, p_D^n)$ of UFILE(t)
is then an independent sample of U(t).

Any probability statement associated with the random variable U(t),
conditioned upon observation of the sensors, may be estimated using files UFILE(t)
and WGHT(t) and the formula (B is a set representing an event)

\[ \Pr\{U(t) \in B \mid \text{sensor observations}\} \approx \frac{\sum_{n:\, U^n(t) \in B} w^n(t)}{\sum_{n=1}^{N_r} w^n(t)}. \tag{II-5} \]

To compute target location probabilities, MAP accepts inputs which define a grid
over the geographical region of interest. This grid might consist, for example, of
cells each covering one degree of latitude and one degree of longitude.


Let the grid cells be denoted $G_j$ for $1 \le j \le J$ and let the corresponding
target location probabilities determined by MAP from files UFILE(t) and WGHT(t)
be denoted $L_j(t)$ for $1 \le j \le J$. Then the values of $L_j(t)$ are computed by MAP
using equation (II-5) for B defined by

\[ B = \{(b_1, \ldots, b_6) \mid (b_1, b_2) \in G_j\}. \]

The updated credence for the $i$th scenario is given by equation (II-5) for B
defined by

\[ B = \{(b_1, \ldots, b_6) \mid b_5 = i\}. \]

Finally, the mean of the updated probability distribution for detection probability
is computed by

\[ \mathrm{E}[P_D \mid \text{sensor observations}] \approx \frac{\sum_{n=1}^{N_r} p_D^n\, w^n(t)}{\sum_{n=1}^{N_r} w^n(t)}. \]
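Since the weights are proportional to the posterior probabilities of the
records, each output described above is a weighted average over UFILE(t). The
following sketch illustrates this under assumed conventions: records are
tuples ordered as (latitude, longitude, north-south speed, east-west speed,
scenario index, detection probability), and the function names are
hypothetical.

```python
def event_probability(records, weights, in_event):
    """Weighted fraction of records whose sample lies in the event B."""
    total = sum(weights)
    return sum(w for u, w in zip(records, weights) if in_event(u)) / total

def mean_detection_probability(records, weights):
    """Weighted mean of the detection-probability samples."""
    total = sum(weights)
    return sum(u[5] * w for u, w in zip(records, weights)) / total

def in_cell(lat0, lon0):
    """Event: target lies in the one-degree cell with corner (lat0, lon0)."""
    return lambda u: lat0 <= u[0] < lat0 + 1 and lon0 <= u[1] < lon0 + 1

def in_scenario(i):
    """Event: the record belongs to scenario i."""
    return lambda u: u[4] == i
```

For example, `event_probability(records, weights, in_cell(10, 20))` estimates
the target location probability for one grid cell, and
`event_probability(records, weights, in_scenario(2))` gives the updated
credence for Scenario 2.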
CHAPTER III

THE  APPLICATION  OF  INFORMATION  THEORY  TO  OPTIMAL  SURVEILLANCE 
IN  A FALSE  TARGET  ENVIRONMENT— AN  EXPLORATORY  ANALYSIS 


This chapter examines in an exploratory way the application of information
theory  to  optimal  allocation  of  surveillance  resources  in  a false  target 
environment.  The  objective  here  is  to  investigate  methods  of  allocating  ASW 
resources  for  the  purpose  of  shaping  the  target  location  probability  distribution 
to  serve  certain  tactically  useful  purposes.  The  preceding  chapter  provides 
methods  for  updating  the  target  location  probability  distribution  to  incorporate 
the  sensor  information  resulting  from  these  allocations. 

We have been motivated by a strong heuristic attraction to policies which
build up the information content of the target location probability distribution.
We are aware, however, that much of this attraction is due to semantics (i.e.,
the fact that the language of information theory is so suggestive in the present
context) and we have tried to exhibit more substantive reasons why further
development of "maximum information-gain policies" may be desirable from
an operational point of view.

The term "optimal" is used in the title of this chapter to reflect a desire
rather than to state an accomplishment. We desire to find the best surveillance
policy within the context of our tactical scenario, and, to this end, we formulate
several policies and examine their properties. The policy based on maximizing
the information content of the target location probability distribution appears to
be close to optimal (among those plans considered) in all cases examined.
Further work is required, however, before more precise statements can be made.

The investigation is approached numerically and theoretically. The numerical
work is based upon monte-carlo simulation of the properties of selected
surveillance policies. These results are presented in this chapter. The theoretical
work has been directed towards establishing the connection between the application
of information theory to surveillance and the application of information theory to
statistical hypothesis testing. The latter applications have an extensive literature.
The results of this theoretical work and review are rather technical and are
presented in Appendix B.
From both perspectives, numerical and theoretical, we feel that the definition
of information used in information theory has the promise of providing a useful
measure of effectiveness for judging the utility of alternative allocations of diverse
ASW resources in certain kinds of surveillance missions.

The first section presents the tactical setting and basic assumptions
underlying the numerical analysis. The surveillance policies considered are described
in the second section, and the numerical results are given in the third section.
The fourth section presents conclusions and provides a brief review of related
analyses which have appeared in the operations research literature. The related
statistical literature is discussed in Appendix B.


Tactical Setting and Basic Assumptions

We  shall  assume  that  there  are  J search  cells,  one  of  which  contains  the 
target.  The  target  may  move  from  cell  to  cell  in  the  course  of  the  search,  and 
target  motion  is  modeled  as  a Markov  process  as  described  below.  The 
surveillance  procedure  consists  of  assigning  ASW  search  effort  to  a selected 
grid  cell  and  then  estimating  the  target's  location  (designating  the  cell  containing 
the  target)  based  upon  the  search  results.  This  is  similar  to  the  "whereabouts" 
searches discussed in reference [e].

The surveillance operation is carried out sequentially in stages where each
stage consists of assigning search effort to a single cell, evaluating the search
results, and then estimating the target's location. Changes in target position
only take place between stages.

As  an  example  of  a potential  application,  consider  a VP  operation  where  each 
day  one  or  more  flights  are  sent  to  an  area  specified  for  that  day.  At  the  end  of 
the  day,  the  search  results  are  evaluated,  the  area  for  the  next  day’s  flight  is 
determined,  and  the  best  estimate  of  the  target’s  location  (specified  by  a grid 
cell)  is  passed  to  the  operational  commander. 

Sensor-response assumptions. If a sensor response is obtained in a cell
searched, this does not necessarily mean that the target is located in that cell.
Because of the possibility of false responses, one never knows with certainty
that the target has been detected.

It is assumed that the performance of the ASW search system is idealized
in terms of a J x J response array,

\[ R = (R(i,j)), \]
where R(i, j) is the probability that an increment of search effort applied to the
jth cell will result in a response, given that the target is located in the ith cell.
Here, as usual, i is the row index and j is the column index. As in Chapter II,
a response is a decision, based upon the available information, in favor of the
hypothesis that the target is present as opposed to the alternative hypothesis
that the target is not present.

At the end of each stage, a cell is selected to contain the target. For all
policies examined in this report, the cell selected is the one having the highest
target location probability based upon evaluation of the search results.

The problem is to determine a surveillance policy (a procedure for assigning
search effort and estimating target location) which will maximize effectiveness
over an extended period of time.

Measure of effectiveness. In order to measure surveillance effectiveness,
let S(k) denote the probability of correctly selecting the cell containing the target
at the end of the kth stage. The function S is called the "success function." A
surveillance policy which maximizes S(k) is called a "k-optimal" surveillance
policy and a surveillance policy which maximizes S(k) for all $k \ge 1$ is referred to
as a "uniformly optimal" surveillance policy.

A success occurs if the correct cell is selected, although this fact can
never be confirmed, since any sensor response is possibly due to a non-target
cause. Confidence in the specified target locations can only be obtained by an
accumulation of evidence, no single item of which is decisive.


Target motion assumptions. Target motion is assumed to be a Markov
process described by an initial probability row vector d and a transition
matrix D. The resulting target motion stochastic process may or may not be
a stationary process, depending upon whether or not d is the stationary vector
for the process. In more general non-Markovian situations, the methods of
Chapter II can be used to model the motion of the target.

For this illustration, it will be assumed that D is a circulant matrix (see,
for example, page 51 of reference [s]) having the form

\[ D(i,j) = \begin{cases} 1 - \delta + \delta/J & \text{if } i = j, \\ \delta/J & \text{if } i \ne j, \end{cases} \tag{III-1} \]

where $0 \le \delta \le 1$. The process with transition matrix D given by equation (III-1)
depends upon a single parameter $\delta$ and is stationary if and only if d is the uniform
distribution $d(j) = 1/J$ for $j = 1, \ldots, J$.

If $d_k$ denotes the target's distribution over the J cells after the kth transition,
then it is not difficult to show that for $j = 1, \ldots, J$ and $k = 1, 2, \ldots$,

\[ d_k(j) = \frac{1}{J} + (1 - \delta)^k \left[ d(j) - \frac{1}{J} \right]. \]

Thus, each component in the target distribution vector converges monotonically
to the corresponding component of the uniform vector. We shall call $\delta$ the
dispersion constant. Note that if $\delta = 0$, then the target is motionless, and if
$\delta = 1$, then the target distribution disperses to the uniform distribution in one
step.
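The closed form for the dispersed distribution can be checked numerically. The
sketch below assumes the transition matrix has diagonal entries
1 - delta + delta/J and off-diagonal entries delta/J, which is the circulant
form consistent with the limiting cases delta = 0 and delta = 1 noted above;
the function names are illustrative only.

```python
def transition_matrix(J, delta):
    """Circulant matrix: stay put with weight 1 - delta, else spread uniformly."""
    return [[(1.0 - delta if i == j else 0.0) + delta / J for j in range(J)]
            for i in range(J)]

def step(d, D):
    """One transition of the row vector d: returns d D."""
    J = len(d)
    return [sum(d[i] * D[i][j] for i in range(J)) for j in range(J)]

def closed_form(d, delta, k):
    """Distribution after k transitions according to the closed form."""
    J = len(d)
    return [1.0 / J + (1.0 - delta) ** k * (dj - 1.0 / J) for dj in d]
```

Iterating `step` k times from an initial vector d and comparing with
`closed_form(d, delta, k)` shows the geometric decay of the departure from the
uniform distribution at rate 1 - delta per stage.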

The object of the tracking policy is to overcome the dispersive effects of
random target motion by the expenditure of search effort.

Formulation in terms of stochastic control. It is useful to look at this
surveillance problem as a problem of controlling a Markov process (for
background in stochastic control see, for example, reference [t]).

Consider a dynamic system whose state is the probability vector P for the
target's location. We are particularly interested in this vector at the beginning
of each stage. Except for the first stage where P is assumed known, P depends
upon random sensor observations, and, therefore, P is itself a random variable.
In fact, the time behavior of P is Markovian when d, D, and R are assumed known
without error. For three cells (J = 3), it is possible to visualize P as a point
(P(1), P(2)) in the plane since $\sum_{j=1}^{3} P(j) = 1$. Figure III-1 shows the state
space for P based upon this interpretation.

The object of the tracking policy is to provide information which will permit
one to correctly select the cell containing the target at the end of each stage.
Since the predetermined selection rule is to pick the cell with the highest posterior
probability as determined by the search results obtained during the stage, we can
consider the state space of P to be the union of three disjoint (except for boundaries)
regions labeled 1, 2, and 3 in Figure III-1. If the point falls in region j at the end
of a stage, then the jth cell is selected as the cell containing the target.

The "control" is a decision function or policy which depends upon P at the
beginning of a stage and which indicates the cell to be searched during that stage.
If the target is actually in cell j during a stage, then the target is visualized as
occupying the vertex determined by P(j) = 1. The purpose of the control is to
guide the point P as often as possible into the set which contains the target.
[Figure III-1. State space diagram for target location probability vector. The
state space is divided into three regions; if P falls within region j, then
Cell j will be selected for the target's location (j = 1, 2, 3). An
illustrative sample path for P is shown.]


Let $P_B$ denote the vector P at the beginning of a stage, and let $P_A$ denote
P at the end of a stage. The target is assumed to move only between stages,
and the $P_B$ for a given stage is computed by $P_B = P_A D$ where $P_A$ is the vector
P at the end of the previous stage and D is the transition matrix.

For the response matrix R and a decision to search cell m, the vector $P_A$
will depend upon whether or not a response is obtained. If a response is not
obtained, then

\[ P_A(j) = \frac{P_B(j)\,(1 - R(j,m))}{\sum_{i=1}^{J} P_B(i)\,(1 - R(i,m))}, \quad \text{for } 1 \le j \le J. \]

If a response is obtained, then

\[ P_A(j) = \frac{P_B(j)\,R(j,m)}{\sum_{i=1}^{J} P_B(i)\,R(i,m)}, \quad \text{for } 1 \le j \le J. \]
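Both cases of the after-search update are an application of Bayes's formula
with a different likelihood column. A minimal sketch follows; the function name
and the example response matrix (a .8 probability of response when the searched
cell holds the target and .3 otherwise, borrowed from the Chapter II examples
purely for illustration) are assumptions.

```python
def after_search(p_b, R, m, response):
    """Return P_A given P_B, response matrix R, searched cell m, and outcome.

    R[i][m] is the probability of a response when cell m is searched and
    the target is in cell i. Cells are indexed from 0 here.
    """
    likelihood = [R[i][m] if response else 1.0 - R[i][m]
                  for i in range(len(p_b))]
    joint = [p * l for p, l in zip(p_b, likelihood)]
    total = sum(joint)  # probability of the observed outcome
    if total == 0.0:
        raise ValueError("observed outcome has zero probability under P_B")
    return [x / total for x in joint]

# Illustrative response matrix for three cells.
R_EXAMPLE = [[0.8 if i == j else 0.3 for j in range(3)] for i in range(3)]
```

With `p_b = [0.5, 0.3, 0.2]` and a search of the first cell, a response raises
that cell's probability to 0.4/0.55, while no response lowers it to 0.1/0.45.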


It is theoretically possible to design a control which will maximize probability
of success S(k) at the end of the kth stage. This would be the generalization of
the single-stage look-ahead policy described in the next section. In fact, by
working backwards in time, this k-optimal control can be found by dynamic
programming, although the solution is rather complicated.

Our main interest, however, is in the situation where all stages are
important and where it is not natural to establish a fixed terminal time. In order
to gain insight into this situation, we will examine the behavior of four decision
policies (i.e., controls); these are described in the next section. Two of the
policies, the single-stage optimal look-ahead policy (Policy I) and a control
based upon maximizing the information content of the posterior distribution
(Policy II), are chosen for their intuitive appeal. The other two policies, a
policy based on searching the highest probability cell (Policy III) and a policy
based upon searching the cells in a regular rotation (Policy IV), are chosen because
they are simple and easy to compute and they give us "bench marks" for comparison.

It should be noted that the highest probability cell policy has been mentioned
as optimal in a closely related scenario examined in reference [u]. In reference
[w], however, the search stops as soon as the first response is obtained and
the search is successful if and only if the response occurs in the cell containing
the target. Our measure of effectiveness is quite different.
Description of the Surveillance Policies

This section describes the four different surveillance policies considered.
These are the optimal single-stage look-ahead policy (Policy I), the maximum
information-gain policy (Policy II), the highest probability cell policy (Policy III),
and the uniform surveillance policy (Policy IV). They are described individually
in the subsections below.

The success function defined in the preceding section is taken to be the
measure of effectiveness and is computed by monte-carlo simulation. For the
kth search stage of the nth monte-carlo replication, let

\[ S(n,k) = \begin{cases} 1 & \text{if the target cell is correctly specified,} \\ 0 & \text{otherwise.} \end{cases} \]

For N replications, the kth stage success probability S(k) is estimated by the
formula

\[ S(k) \approx \frac{1}{N} \sum_{n=1}^{N} S(n,k). \]

For each comparison, an initial target location probability distribution d
and a transition matrix D are specified to describe the target's movements and a
response matrix R is specified to describe the search environment and the sensor
system.


The monte-carlo calculation begins by drawing a random number to pick
the cell for the target's initial location. This selection is made in accordance
with the initial target location probability distribution d.

The search policy specifies a search cell for each stage based upon the
current before-search target location probability distribution $P_B$; then the
search results are simulated in accordance with the target's actual location and
the probabilities given by the response array R. Next, the after-search target
location probability distribution $P_A$ is determined from the simulated search
results, and, finally, the after-search highest probability cell (based upon $P_A$)
is selected as the target cell. The target position is then updated in accordance
with the target motion transition matrix D at the end of the stage and a new
estimate of the before-search probability distribution $P_B$ is obtained by computing
$P_B = P_A D$. The process is then repeated.
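The replication loop just described can be sketched as follows. This is an
illustration under stated assumptions rather than the program used for the
report: a policy is taken to be any function mapping the before-search
distribution to a cell index (counted from 0), ties in the final selection go
to the lowest-numbered cell, and all names are hypothetical.

```python
import random

def highest_probability_cell(p_b):
    """Example policy: search the cell with the largest before-search probability."""
    return max(range(len(p_b)), key=lambda j: p_b[j])

def simulate_success(policy, d, D, R, stages, replications, seed=0):
    """Estimate the success function S(k) for k = 1, ..., stages."""
    rng = random.Random(seed)
    J = len(d)
    hits = [0] * stages
    for _ in range(replications):
        target = rng.choices(range(J), weights=d)[0]   # initial target cell
        p_b = list(d)
        for k in range(stages):
            m = policy(p_b)                            # cell to search
            response = rng.random() < R[target][m]     # simulated search result
            like = [R[i][m] if response else 1.0 - R[i][m] for i in range(J)]
            joint = [p * l for p, l in zip(p_b, like)]
            total = sum(joint)
            p_a = [x / total for x in joint]           # after-search distribution
            guess = max(range(J), key=lambda i: p_a[i])
            hits[k] += int(guess == target)
            # Target moves between stages; the estimate disperses as P_A D.
            target = rng.choices(range(J), weights=D[target])[0]
            p_b = [sum(p_a[i] * D[i][j] for i in range(J)) for j in range(J)]
    return [h / replications for h in hits]
```

Fixing the seed makes a comparison of two policies repeatable, although each
call still draws its own sequence of targets and responses.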
The optimal single-stage look-ahead policy (Policy I). The optimal
single-stage look-ahead policy is to search in the cell which, based upon the estimated
vector $P_B$, maximizes the probability of correctly selecting the target's cell
at the end of the stage. This is a generalization of the optimal whereabouts plan
formulated in reference [e] for searches without false responses.

More precisely, at the beginning of any stage, let $P_B(i)$ denote the
before-search probability that the target is located in cell i and let R be the response
matrix. Let $p_A(r, i, j)$ be the conditional after-search probability that the target
is located in the ith cell given that the jth cell was searched and result r was
obtained. Here, r = 1 indicates a target-like response and r = 0 indicates no
target-like response. Let Q(r, i, j) denote the probability of obtaining search
result r given that the target is in cell i and that cell j is searched. Then

\[ Q(r,i,j) = \begin{cases} R(i,j) & \text{for } r = 1, \\ 1 - R(i,j) & \text{for } r = 0. \end{cases} \]

The probability function $p_A$ is determined from $P_B$ and Q by the equation

\[ p_A(r,i,j) = \frac{P_B(i)\,Q(r,i,j)}{\sum_{m=1}^{J} P_B(m)\,Q(r,m,j)}. \]

Let $\lambda(r, j)$ denote the cell selected to contain the target given that cell j was
searched and result r was obtained. In view of the selection rule, which states that
the cell with the highest target location probability should be chosen, we have

\[ p_A(r, \lambda(r,j), j) \ge p_A(r,i,j) \quad \text{for } 1 \le i \le J. \]


Let B(j) denote the before-search probability that if the jth cell is searched,
then the target cell will be correctly selected based upon the search results. If

\[ \theta(x) = \begin{cases} 1 & \text{if } x = 0, \\ 0 & \text{otherwise,} \end{cases} \]

then

\[ B(j) = \sum_{i=1}^{J} \sum_{r=0}^{1} P_B(i)\,Q(r,i,j)\,\theta(i - \lambda(r,j)) \]

\[ \phantom{B(j)} = \sum_{r=0}^{1} P_B(\lambda(r,j))\,Q(r, \lambda(r,j), j) \]

\[ \phantom{B(j)} = \max\{P_B(i)\,R(i,j) : 1 \le i \le J\} + \max\{P_B(i)\,[1 - R(i,j)] : 1 \le i \le J\}. \]


The optimal single-stage look-ahead policy is to search in cell j* for which

\[ B(j^*) \ge B(j) \quad \text{for } 1 \le j \le J. \]

If more than one cell qualifies, then from the qualifying cells the cell with
the highest probability according to $P_B$ is chosen. If more than one cell still
qualifies, then the search cell is selected randomly from the qualifying cells
according to a uniform distribution.
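The look-ahead score B(j) and the two-level tie-breaking rule translate
directly into code. The following is a minimal sketch with hypothetical names;
the tolerance used to decide when two scores are tied is an assumption that
the report does not address, and cells are indexed from 0.

```python
import random

def look_ahead_scores(p_b, R):
    """B(j) for each cell j: chance of a correct selection if j is searched."""
    J = len(p_b)
    return [max(p_b[i] * R[i][j] for i in range(J))
            + max(p_b[i] * (1.0 - R[i][j]) for i in range(J))
            for j in range(J)]

def policy_one(p_b, R, rng=random, tol=1e-12):
    """Pick the cell maximizing B(j); break ties by P_B, then at random."""
    scores = look_ahead_scores(p_b, R)
    best = max(scores)
    tied = [j for j, s in enumerate(scores) if s >= best - tol]
    top = max(p_b[j] for j in tied)
    tied = [j for j in tied if p_b[j] >= top - tol]
    return rng.choice(tied)
```

With `p_b = [0.5, 0.3, 0.2]` and a response matrix having .8 on the diagonal
and .3 elsewhere, the scores are 0.61, 0.59, and 0.51, so the first cell is
searched.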

The maximum information-gain policy (Policy II). The maximum
information-gain policy is to search in the cell which maximizes the expected information
content (or equivalently minimizes the expected entropy) of the posterior
after-search target location probability distribution.

More precisely, let $P_B(i)$ and $p_A(r, i, j)$ be defined as above, and let the
entropy (see reference [d]) H(P) of any probability vector P over J cells be
defined by

\[ H(P) = -\sum_{j=1}^{J} P(j) \ln P(j). \]

Intuitively speaking, as the entropy of a distribution increases, the distribution
flattens. It is well known (see, for example, reference [d]) that maximum
entropy† is attained by the uniform distribution. The information content of a
distribution P is defined to be -H(P) + C where C is some fixed constant. We are
interested only in changes in information and, hence, the value of C is not important.

The expected entropy U(j) of the posterior target location distribution given
search in cell j is given by

$$\begin{aligned}
U(j) &= \sum_{i=1}^{J} \sum_{r=0}^{1} P_B(i)\, Q(r, i, j)\, H(p_A(r, \cdot, j)) \\
&= -\sum_{i=1}^{J} \sum_{r=0}^{1} P_B(i)\, Q(r, i, j) \ln[p_A(r, i, j)].
\end{aligned}$$

† This result holds for probability distributions on a finite number of points
but not for probability distributions on a countably infinite number of points.
In this latter case, a uniform probability distribution is not defined.


The maximum information-gain policy is to search in any cell j* for which

$$U(j^*) \le U(j) \quad \text{for } 1 \le j \le J.$$

This corresponds to a Lindley procedure (see reference [f]) for sequential
experimental design as discussed in Appendix B. If more than one cell satisfies
the above inequality, then the search cell is selected randomly from the qualifying
cells according to a uniform distribution.
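The expected-entropy criterion can be sketched directly from the second form of U(j). This is an illustrative sketch, not the report's own program: the names `expected_entropy`, `entropy`, and `policy_two` are hypothetical, cells are 0-based, terms with zero weight are taken to contribute zero, and the example matrix is the one implied by the Case I(a) description.

```python
import math
import random

def entropy(p):
    """H(P) = -sum P(j) ln P(j), with 0 ln 0 taken as 0."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def expected_entropy(p_b, R, j):
    """U(j): expected entropy of the after-search distribution for a search of cell j."""
    u = 0.0
    for r in (0, 1):
        # joint weights P_B(i) * Q(r, i, j) over target cells i
        w = [p_b[i] * (R[i][j] if r == 1 else 1.0 - R[i][j])
             for i in range(len(p_b))]
        pr_r = sum(w)  # probability of obtaining result r
        for w_i in w:
            if w_i > 0.0:
                u -= w_i * math.log(w_i / pr_r)  # w_i / pr_r is p_A(r, i, j)
    return u

def policy_two(p_b, R, rng=random, tol=1e-12):
    """Pick a cell minimizing U(j); ties are broken uniformly at random."""
    u = [expected_entropy(p_b, R, j) for j in range(len(p_b))]
    best = min(u)
    return rng.choice([j for j, u_j in enumerate(u) if u_j - best <= tol])

# Assumed example values (response matrix from the Case I(a) description)
R = [[0.8, 0.1, 0.1],
     [0.7, 0.8, 0.1],
     [0.7, 0.1, 0.8]]
p_b = [0.33, 0.33, 0.34]
u_values = [expected_entropy(p_b, R, j) for j in range(3)]
choice = policy_two(p_b, R)
```

In this example the first cell has the largest expected entropy, so the policy avoids it; this matches the remark elsewhere in the chapter that little is learned from a search of that cell.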

The highest probability cell policy (Policy III). The highest probability cell
policy is to search in the cell with the highest before-search probability. That is,
if at the beginning of a stage $P_B(j)$ is the before-search probability that the target
is located in cell j, then the highest probability cell policy is to search in any cell
j* for which

$$P_B(j^*) \ge P_B(j) \quad \text{for } 1 \le j \le J.$$

If more than one cell qualifies, then the search cell is selected randomly from
the qualifying cells according to a uniform distribution.


It is interesting to note that the before-search target location probability
distribution is identical to the expectation (with respect to $P_B$) of the after-search
target location probability distribution regardless of the cell searched. In order
to show this, let $\mathcal{E}(i, j)$ denote the expected after-search probability that the
target is located in cell i, given that cell j is searched. Then

$$\begin{aligned}
\mathcal{E}(i, j) &= \sum_{r=0}^{1} \Big[ \sum_{n=1}^{J} P_B(n)\, Q(r, n, j) \Big]\, p_A(r, i, j) \\
&= \sum_{r=0}^{1} \Big[ \sum_{n=1}^{J} P_B(n)\, Q(r, n, j) \Big]\, \frac{P_B(i)\, Q(r, i, j)}{\sum_{m=1}^{J} P_B(m)\, Q(r, m, j)} \\
&= \sum_{r=0}^{1} P_B(i)\, Q(r, i, j) \\
&= P_B(i).
\end{aligned}$$
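The identity can also be checked numerically. The following is a small sketch with a hypothetical function name `expected_posterior`, 0-based cells, and an assumed example matrix (the one implied by the Case I(a) description); it averages the after-search distribution over both possible results, weighted by their probabilities.

```python
def expected_posterior(p_b, R, j):
    """Average of p_A(r, ., j) over r, weighted by the probability of each result."""
    J = len(p_b)
    expected = [0.0] * J
    for r in (0, 1):
        w = [p_b[i] * (R[i][j] if r == 1 else 1.0 - R[i][j]) for i in range(J)]
        pr_r = sum(w)  # probability of result r given a search of cell j
        if pr_r == 0.0:
            continue   # a result that cannot occur contributes nothing
        for i in range(J):
            expected[i] += pr_r * (w[i] / pr_r)  # Pr(r) * p_A(r, i, j)
    return expected

# Assumed example values
R = [[0.8, 0.1, 0.1],
     [0.7, 0.8, 0.1],
     [0.7, 0.1, 0.8]]
p_b = [0.5, 0.3, 0.2]
checks = [expected_posterior(p_b, R, j) for j in range(3)]
```

Each entry of `checks` agrees with `p_b` up to rounding, whichever cell is searched.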
The uniform surveillance policy (Policy IV). The uniform surveillance
policy is to search systematically through all search cells in a fixed rotation.
In mathematical notation, the j-th cell is searched during the k-th stage where
j = 1 + (k-1) (mod J) for k = 1, 2, ..., and J equal to the number of search cells.
That is, one searches the J cells in order and then repeats the search as often
as required.
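The rotation rule is a one-line computation. In this sketch the function name `uniform_surveillance_cell` is hypothetical, and cells and stages are numbered from 1 as in the text.

```python
def uniform_surveillance_cell(k, J):
    """Cell searched at stage k (k = 1, 2, ...) under a fixed rotation of J cells."""
    return 1 + (k - 1) % J

# First seven stages for a three-cell grid
schedule = [uniform_surveillance_cell(k, 3) for k in range(1, 8)]
```

For three cells, the first cell is searched at stages 1, 4, 7, and so on.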

If the target does not move, it is not difficult to prove that the success
function for this policy will converge to 1 whenever the rows of the response
matrix are distinct (the usual case).


Numerical  Comparison  of  Surveillance  Policies 

This section provides a numerical comparison of the four surveillance
policies described in the preceding section. Five surveillance cases are
considered corresponding to different assumptions about d, D, and R. In order
to reduce complexity and make it easier to interpret the results, the search grid
is limited to three cells in the first four cases and nine cells in the fifth case.
Moving targets are considered only in the first case.

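Three response matrices are examined. The first is

$$R = \begin{pmatrix} .8 & .1 & .1 \\ .7 & .8 & .1 \\ .7 & .1 & .8 \end{pmatrix} \qquad \text{(III-2)}$$

and is used in Cases I, II, and III. Recall that R(i, j) is the probability of
obtaining a response from a search of cell j given that the target is in cell i.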

This particular form of R is chosen in order to simulate a situation where search
in one cell (the first) produces very little information gain. In this cell, the
true-response and false-response probabilities are nearly equal (.8 and .7,
respectively).

Three initial target location probability distributions are used with the
response matrix of equation (III-2); these are a uniform distribution (Case I)
given by d(1) = .33, d(2) = .33, and d(3) = .34, a "highly" non-uniform distribution
(Case II) given by d(1) = .75, d(2) = .15, and d(3) = .10, and a "moderately" non-
uniform distribution (Case III) given by d(1) = .5, d(2) = .3, and d(3) = .2.

The second response matrix, equation (III-3),
corresponds physically to a situation where the three cells are arranged in a row
and where the response probabilities increase the "closer" one gets to the target
cell. The uniform target location probability distribution is used with the
response matrix given by equation (III-3) in Case IV.

The third response matrix, equation (III-4),
is considered in Case V and has some features in common with the response
matrices given by equations (III-2) and (III-3). It is similar to the response
matrix given by equation (III-2) in that little information is gained from
searching certain cells. Equation (III-4) represents an extreme case in which
no information is gained from searching cells 1 through 8. It is similar to the
response matrix given by equation (III-3) in that one may think of cells 1 through 9
arranged in a row with the probability of a response from a search of cell 9
increasing with decreasing distance from the target.

The initial target location probability distribution used in Case V is
d(1) = .2 and d(j) = .1 for 2 ≤ j ≤ J.

The numerical results are given in the following subsections. In Cases I
through IV, 400 Monte Carlo replications are used for each curve, and in
Case V, 50 replications are used.
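A replication study of this kind can be sketched as below. This is a rough sketch under stated assumptions, not the program used for the report: the name `simulate_success` is hypothetical, the target is stationary, success at a stage is taken to mean that the highest-probability cell after that stage's search is the true target cell (ties go to the lowest-numbered cell), cells are 0-based, and the example matrix is the one implied by the Case I(a) description.

```python
import random

def simulate_success(R, d, choose_cell, stages, replications, seed=0):
    """Estimate the success probability at each stage for a stationary target."""
    rng = random.Random(seed)
    J = len(d)
    hits = [0] * stages
    for _ in range(replications):
        target = rng.choices(range(J), weights=d)[0]  # true cell drawn from d
        p = list(d)                                   # before-search distribution
        for k in range(stages):
            j = choose_cell(k + 1, p)                 # stage numbers start at 1
            r = 1 if rng.random() < R[target][j] else 0  # simulated response
            w = [p[i] * (R[i][j] if r == 1 else 1.0 - R[i][j]) for i in range(J)]
            total = sum(w)
            p = [x / total for x in w]                # after-search distribution
            guess = max(range(J), key=lambda i: p[i])
            hits[k] += int(guess == target)
    return [h / replications for h in hits]

# Assumed example: uniform surveillance (fixed rotation) on a three-cell grid
R = [[0.8, 0.1, 0.1],
     [0.7, 0.8, 0.1],
     [0.7, 0.1, 0.8]]
d = [0.33, 0.33, 0.34]
rotation = lambda k, p: (k - 1) % len(p)
curve = simulate_success(R, d, rotation, stages=30, replications=400)
```

Any other policy can be substituted by passing a different `choose_cell` function that maps the stage number and current distribution to a cell index.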

Case I(a): stationary target. As mentioned above, there are three grid
cells and the initial target location probability distribution is uniform. The
response matrix is given by equation (III-2). For all cells, if the target is in
the cell searched, then the probability of response is .8. The probability of
false response is .7 in the first cell and the probability of false response is .1
in the second and third cells. These false-response probabilities do not
depend upon the location of the target.


Figure III-2 provides the estimated probability of success curves for
Policies I through IV. Notice that the success probability for the maximum
information-gain policy (Policy II) approaches 1 asymptotically. In the earlier
stages of search, there appears to be no statistically significant difference
between the maximum information-gain policy and the optimal single-stage
look-ahead policy (Policy I). Asymptotically, however, the maximum information-
gain policy appears to have a slight advantage. It is interesting to note that the
uniform surveillance policy (Policy IV) also does well in this example. The
highest probability cell policy (Policy III) is not particularly attractive in contrast
to the other policies.

A problem with Policy I occurs when the target location probability distribution
becomes very concentrated. When this happens, the after-search estimate of
target location will not depend upon the cell searched nor upon the search results.
Since only one stage is considered, all search cells appear equally attractive and
the before-search highest probability cell is chosen. This can lead to trouble as
we shall see in Case V. Table III-1 displays the details of one of the Monte Carlo
replications for Policy I in order to illustrate this point. Notice that the state of
indeterminacy is reached at the third search stage.

Case I(b): moving target. The same initial target location probability
distribution and sensor response assumptions are made as in Case I(a). In the
present case, however, the target is permitted to move between stages according
to a Markov process specified by the dispersive transition matrix given by
equation (III-1). The comparison is limited to Policies II and IV.

Figure III-3 shows the influence of target motion on probability of success
when the maximum information-gain policy (Policy II) is used. Three examples
are considered corresponding to values of the dispersion constant, δ = 0, δ = .3,
and δ = 1. The example where δ = 0 corresponds to no target motion, and,
therefore, this curve is the same as that given in Figure III-2 for Policy II. The
example where δ = 1 corresponds to complete dispersion of the target location
probability distribution to a uniform distribution at each stage. Here, even if
the target's position is known with certainty at the end of a stage, the ensuing
motion will produce a uniform distribution for target location at the beginning of
the next stage.
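The two limiting cases can be illustrated with a simple motion update. Equation (III-1) is not reproduced in this section, so the sketch below assumes one form consistent with the description: a mixture that keeps the current distribution with weight 1 - δ and spreads weight δ uniformly over the cells. The function name `disperse` and this mixture form are assumptions made for illustration.

```python
def disperse(p, delta):
    """Between-stage motion update under the assumed mixture form:
    keep the distribution with weight (1 - delta), spread delta uniformly."""
    J = len(p)
    return [(1.0 - delta) * x + delta / J for x in p]

# Target location known with certainty at the end of a stage
p_after = [0.0, 1.0, 0.0]
no_motion = disperse(p_after, 0.0)   # unchanged
partial = disperse(p_after, 0.3)     # partly dispersed
complete = disperse(p_after, 1.0)    # uniform over the three cells
```

Under this assumed form, δ = 0 leaves the distribution unchanged and δ = 1 returns the uniform distribution regardless of the starting point, matching the two extremes described above.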

The first transition of the Markov process is made at the end of the first
stage. Therefore, S(1) is identical for all three values of the dispersion constant.

Figure III-4 shows the influence of target motion on probability of success
when the uniform surveillance policy is used. Results are shown for the same
values of the dispersion constant as used in Figure III-3. The striking irregularity
of the curves given in Figure III-4 is due to the fact that the uniform surveillance
policy considered here is a regular rotation of search through the three cells in
the grid. As noted before, Cell 1 is a particularly poor cell to search because of
the high probability of false response in this cell. The dips in the curves
correspond to search of Cell 1 at stages 3k+1 for k = 0, 1, 2, .... These dips
become increasingly pronounced as the dispersion constant increases.

As  in  Figure  III-3  the  curves  coincide  at  the  first  stage  since  there  has  been 
no  target  motion  up  to  this  point. 

It should be noted that if the uniform surveillance policy were implemented
by selecting the search cells at random according to a uniform distribution,
then the curves would be smoother and would not exhibit the periodic dips.
This would not, however, improve the average performance of the policy.

Case II. In this case the initial target location probability distribution is
non-uniform with the highest probability assigned to the first cell, i.e.,
d(1) = .75, d(2) = .15, and d(3) = .10. The response matrix is the same as in
Case I.

This case is presented to illustrate a situation where searching the highest
probability cell is clearly not a good policy. Here, the first cell has a very high
initial target location probability but very little is learned from a search of this
cell because of the high false-response probability.

Figure III-5 provides the estimated probability of success curves for Policies I
through IV. As anticipated, the highest probability cell policy does not appear to
be very good. In fact, it is only slightly better than the trivial policy which would
select the target cell at random in accordance with the initial target location
probability distribution d and reselect the same cell at each stage. In this case
the trivial policy would select the first cell with probability .75, the second with
probability .15, and the third with probability .1.
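As a rough worked check on the trivial policy, assume that the target is stationary and that the selected cell is drawn from d independently of the true target location. The probability of a correct selection is then the same at every stage and equals the sum of the squared initial probabilities. This figure is derived here for illustration and is not quoted from the report's results:

$$\sum_{i=1}^{3} d(i)^2 = (.75)^2 + (.15)^2 + (.10)^2 = .5625 + .0225 + .0100 = .595.$$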

Once again, Policy II does very well, and as one might expect, Policy I
initially does better than Policy II with the latter catching up in the later stages.
Once again, we note that S for Policy IV will always converge to one when the
target is stationary and the rows of the response matrix are distinct.

Case III. As in Case II, it is assumed that the initial target location
probability distribution is non-uniform and given by d(1) = .5, d(2) = .3, and
d(3) = .2. Figure III-6 provides the estimated probability of success curves for
Policies I through IV.

Once again, we see that Policy II, the maximum information-gain policy,
appears to be better than the others.

Case IV. In this case, the initial target location probability distribution is
uniform, i.e., d(1) = .33, d(2) = .33, and d(3) = .34, and the response matrix
is given by equation (III-3). Here, Cells 2 and 3 have relatively high false-
response probabilities in contrast to the previous cases.

Figure III-7 provides the probability of success curves for Policies I
through IV. In contrast to the other cases considered thus far, there appears
to be little difference in the probability of success curves.

Case V. The purpose of this case is to examine a situation where the
maximum information-gain policy (Policy II) is definitely superior to the other
plans considered. It is assumed that there are 9 search cells and that the initial
target location probability distribution d is given by d(1) = .2 and d(j) = .1 for
j = 2, ..., 9. The response matrix is given by equation (III-4). The results are
shown in Figure III-8 for Policies I, II, and IV. Policy III is extremely poor in
this case and is not shown. It continually picks the first cell for search and its
probability of success function remains constant with a value of .2.

According to this response matrix, no information is gained by searching in
Cells 1 through 8. Since the uniform surveillance policy (Policy IV) rotates
search through all cells, a considerable amount of time will be lost when this
plan is used. Although the probability of success function for Policy IV is
guaranteed to converge to 1, the convergence will be slow.

The optimal single-stage look-ahead policy will also have difficulty in this
case, and, in fact, the probability of success function for this plan does not
appear to converge to 1. The reason for this is that a state of indeterminacy is
eventually reached by this plan; this behavior was previously noted in Case I and
illustrated in Table III-1. When the before-search target location probability
distribution $P_B$ is driven to the state where the after-search selection of the
target cell is the same regardless of which cell is searched or what response
is obtained, then the cell with the largest before-search probability is searched.
However, if the highest probability cell is among the first 8, then no information
is gained by the search and the after-search probability distribution is the same
as the before-search probability distribution. This means that the same cell
will be searched continually in succeeding stages and progress will stop.


Conclusions  and  Related  Operations  Research  Studies 

The principal conclusion based upon the numerical examples in the preceding
section is that the maximum information-gain policy (Policy II) appears to have
very desirable characteristics in the idealized surveillance scenario considered.
Among these characteristics (as measured by the success function) are good
initial behavior in the early stages and good asymptotic behavior in the later
stages. The initial behavior is measured principally by comparison with the
optimal single-stage look-ahead policy (Policy I) which is designed to be good
in the early stages. The asymptotic behavior is measured principally by
comparison with the uniform surveillance policy (Policy IV) whose success
function, for a stationary target, is guaranteed to converge to 1 as the number
of stages increases indefinitely (provided the rows of R are distinct).

We conjecture that Policy II will also perform well in cases where an
incorrect prior distribution d is used, i.e., that Policy II is robust with respect
to errors in d. More analysis is needed to verify this conjecture, however.

Policy I appears to have very good behavior until the "saturation" period is
reached (see Table III-1) where the search cell can no longer be uniquely chosen
by picking the cell which maximizes the value of the single-stage success
probability function B. The ad hoc rule of choosing the highest probability cell
at this point produces poor asymptotic performance in some situations (see
Case V) and a better rule should be employed. Switching to randomized uniform
surveillance at the saturation point would be an improvement.

Even better, perhaps, would be the extension of Policy II to optimal
multi-stage look-ahead policies. In this regard, the theory of optimal stochastic
control might offer some useful insights.

The highest probability cell policy (Policy III) has little to commend it, in
general, although in certain special cases (e.g., Case IV) it may produce
satisfactory results. Its poor behavior, in general, results from the fact
that it does not make good use of the information in the response matrix.

It should be noted that none of the policies considered make non-trivial use of the
information in the Markov transition matrix D. It seems worthwhile to formulate
and evaluate surveillance plans which anticipate target motion by explicit
consideration of D or, more generally, consideration of whatever stochastic
mechanism is used for updating target location.

In the results presented in this chapter, it has been assumed that the
response matrix R is known exactly. Since this is unlikely to be true in practice,
it would be useful to relax this assumption and develop policies which estimate R
and target location simultaneously. This kind of adaptive estimation (see
reference [o]) is illustrated in the examples in Chapter II (without optimization
considerations, however); there the single-glimpse probability of detection $P_d$
is treated as a random variable and estimated from the sensor observations.

In view of the good performance of Policy II based upon maximizing the
expected information gain in the after-search target location probability
distribution, it is somewhat surprising that there has been so little utilization
of information theory in search and surveillance problems in operations research.
In fact, the relevant work which has been carried out and reported in the literature
has not reflected favorably upon the use of information theory as a tool for the
analysis of these problems.

One of the earliest readily accessible papers on the subject (reference [h],
which appeared in 1961) discusses the connection between information theory and
search theory and concludes with the statement "Thus, search theory should be
considered in connection with the general theory of statistical decisions rather
than with information theory." This statement is repeated and reaffirmed with
further examples in reference [i] which appeared in 1971.

Both references [h] and [i] examine the search plans which maximize
expected information gain. As discussed below, we believe this is the correct
approach for surveillance but not for search.

In  reference  [ j ] which  appeared  in  1908,  it  is  stated  that  "Ever  since  the 
mid-nineteen-forties when the theories of information and of search became
subjects of general interest, attempts have been made to apply the theory of
information to problems of search. These have proved disappointing; neither
the formulas nor the concepts of the former theory have found a place in
clarifying the problems of the latter."

Why do our results convey the opposite impression? The answer, we believe,
is that one must make a clear distinction between search, where the objective is
detection of the target, and surveillance, where the objective is knowledge of the
target's location. The concepts of information theory can be applied to both
types of problems but in different ways.

For the search problem (but not the surveillance problem treated in this
chapter), we believe that the proper way to draw the connection between information
theory and search theory is to think of an optimal search plan as one which
maximizes (rather than minimizes) the entropy of the posterior target location
probability distribution. Viewed this way (which is different from the approach of
references [h] and [i]), search effort is used to extract information from the
distribution rather than to add information to the distribution.

For the surveillance problem, however, it seems appropriate to maximize
the information gain (minimize entropy). This is especially true in multi-stage
scenarios, such as those we have examined in this chapter, where success can
be achieved without detection of the target in the usual sense. The scenarios
discussed in references [h] and [i] are limited to a single stage and thus the
time behavior of the search policies is not apparent. Another point of difference
between our analysis and those of references [h], [i], and [j] is that the latter
do not consider the possibility of false responses.

Reference [v] makes use of information theory concepts to consider the
optimal distribution of reconnaissance effort against targets in the presence of
decoys. This analysis is addressed to aerial reconnaissance against land
targets and is closely related to our present study. There is, however, an
important point of difference which is discussed below.

Part I of reference [v] has the most in common with our present study.
Part II considers questions related to enemy hindrance of the operations, a
problem which we do not consider.

The basic assumption in reference [v] is that there are J regions (cells)
and that in the j-th region there are "N possible objects" with a priori probabilities
$p_1^j, \ldots, p_N^j$ subject to the constraint

$$\sum_{n=1}^{N} p_n^j = 1 \quad \text{for } j = 1, \ldots, J. \qquad \text{(III-5)}$$


For example, in one important case, N = 3 and the objects are a missile
installation, a decoy, and nothing of interest.

The uncertainty in a region is defined in reference [v] as a function U of the
a priori probabilities for that region, involving some positive constant. The
uncertainty in the entire map is defined by equation (III-6).

Reference [v] introduces and discusses assumptions pertaining to the optimal
allocation of reconnaissance effort in order to minimize the uncertainty given
by equation (III-6) subject to the constraint given by equation (III-5).


Although closely related to our problem, a critical difference is that we also
make use of the knowledge that there is a single target present in the area of
interest. This is an extremely important piece of information for it allows
sensor responses and other information obtained on scene to be correlated with
target motion considerations.


In the scheme of reference [v], the case where it is known that there is a
single target present corresponds to N = 2, where $p_1^j$ is the a priori probability
that the target is present in the j-th cell and $p_2^j = 1 - p_1^j$. The complication arises
from the additional constraint that

$$\sum_{j=1}^{J} p_1^j = 1.$$


This constraint is necessary and important in our tactical setting but,
unfortunately, it transforms the separable allocation problem considered in
reference [v] into a non-separable problem. In general, non-separable problems
are much more difficult to solve than separable problems. In this chapter, we
have avoided this difficult allocation problem by restricting the search policy to
examination of a single cell at each stage. Further work is needed to devise
efficient computational algorithms for obtaining optimal multi-cell allocations
which maximize the expected information gain.

APPENDIX A

GENERALIZED TREATMENT OF THE INFORMATION PROCESSING SYSTEM


This appendix presents a generalized treatment of the mathematical technique
used to calculate the examples in Chapter II. The general mathematical model is
presented  and  discussed  in  the  first  section.  The  second  section  specializes  the 
analysis  to  Markov  models.  Among  other  things,  this  specialization  leads  to  the 
development  of  recursive  computational  procedures.  The  next  section  discusses 
the  mathematical  model  of  Chapter  II  in  terms  of  the  more  general  formalism 
presented  in  this  appendix.  This  is  followed  by  two  sections  addressed,  respectively, 
to  reduction  of  state  space  dimensionality  and  numerical  computation. 


General  Mathematical  Model 


In this section the term "model" will be used to refer to the $N_M$-dimensional
vector-valued process $M = (m_1, \ldots, m_{N_M})$ whose components comprise all of the
stochastic processes which are relevant to the information processing situation under
consideration.

The probability structure for the model is given by the triple $(\Omega, \mathcal{A}, \Pr)$ where
$\Omega$ is the probability space, $\mathcal{A}$ is a σ-field* of subsets of $\Omega$, and Pr is a probability
distribution (a measure) defined on $\mathcal{A}$. Thus,

$$M(t) : \Omega \to S$$

for 0 ≤ t < ∞ where the "state space" S is an $N_M$-dimensional Euclidean space.
The σ-field of Borel sets of S is denoted by $\mathcal{B}$.

* All random variables are $\mathcal{A}$-measurable functions (perhaps vector-valued)
defined on $\Omega$. When explicit dependence on $\omega \in \Omega$ is shown, $\omega$ will appear
as the last argument on the right, e.g., $M(t, \omega)$ is abbreviated as M(t).

The model M consists of "observable" and "non-observable" stochastic
processes, which we explain in turn.

An observable process is associated with a physical phenomenon whose
characteristics can be expressed in quantitative terms and can be assumed
known to the processing system. The response processes of acoustic and non-
acoustic sensors are specific examples of observable processes which are of
particular interest in ASW. In simplest terms, sensor responses may be treated
as (0, 1)-valued processes where 1 denotes a response and 0 denotes no response.
The specification of a 0 or 1 for a sensor at any particular time might be the
result, for example, of a human judgment or the output of an automatic classification
device.

In more complicated formulations, the observable processes might correspond
to more basic quantities such as voltages generated by the sensor hydrophones.
Regardless of complexity, however, the probability structure $(\Omega, \mathcal{A}, \Pr)$ must
be established explicitly so that, among other things, one may compute the
probabilities of the events associated with the mutual interactions
of stochastic processes within the model.

The non-observable stochastic processes consist both of "physical processes"
which describe, for example, target radiated noise, position, course, and speed,
and "non-physical processes" which are required to ensure that the model is
logically self-consistent and possesses certain desirable mathematical properties.
The non-physical processes are not susceptible to physical measurement and
verification in the way that the physical processes are, but nevertheless play an
essential role in the operation of the information processor.

An example of a commonly used non-physical process is the time-correlated
stochastic process often introduced in models to represent random fluctuations
in the signal-to-noise ratio of acoustic sensors (see reference [p]). This time-
correlated process is a logical necessity in cases where sensors are observed
continuously since otherwise unreasonable results are obtained, e.g., if one
assumes the random variables of the fluctuation process are mutually statistically
independent in time (white noise).

The processes of the model M are ordered so that M = (U, V), where the
non-observable processes of M are collected into one $N_U$-dimensional vector-valued
process $U = (u_1, \ldots, u_{N_U})$, and the observable processes are collected into one
$N_V$-dimensional vector-valued process $V = (v_1, \ldots, v_{N_V})$.

At times it will be useful to write S as the Cartesian product of the $N_U$-
dimensional Euclidean space $S^1$ and the $N_V$-dimensional Euclidean space $S^2$, i.e., to
write $S = S^1 \times S^2$. A superscripted symbol for a point or a set will indicate
membership in $S^1$ or $S^2$. Points and sets without superscripts will generally
be associated with S. For example, we might write $A^1 \subset S^1$, $A^2 \subset S^2$, and
$A = A^1 \times A^2 \subset S^1 \times S^2$.




It is assumed that the observable processes are monitored at discrete time
instants $\tau_1 < \tau_2 < \cdots$ and that $\tau_1 > 0$. Continuous observations over intervals
of time are not considered here, but with somewhat more effort they could be
included within the general processing framework under discussion.

Let $\mathcal{O}_t$ be the sub-$\sigma$-field of $\mathcal{A}$ generated by the collection of
observable random variables $\{V(\tau_k) : \tau_k \le t\}$. Events in $\mathcal{O}_t$ correspond to
observations which have occurred at or before time t. In the most abstract terms,
we are interested in calculating the conditional probabilities ($P_c$ denotes a conditional
probability operator)

$$ P_c\{A \mid B\} = \frac{\Pr\{A \cap B\}}{\Pr\{B\}} \qquad \text{(A-1)} $$

whenever $A \in \mathcal{A}$, $B \in \mathcal{O}_t$, and $\Pr\{B\} > 0$. More generally, if $P_c\{A \mid \mathcal{O}_t\}$ denotes
the conditional probability of $A \in \mathcal{A}$ given the $\sigma$-field $\mathcal{O}_t$, then $P_c\{A \mid \mathcal{O}_t\}$ is an
$\mathcal{O}_t$-measurable function and

$$ \Pr\{A \cap C\} = \int_C P_c\{A \mid \mathcal{O}_t\}\, d\Pr, $$

for all $C \in \mathcal{O}_t$. This formulation $P_c\{A \mid \mathcal{O}_t\}$ of conditional probability is required,
for example, in cases where $\Pr\{B\} = 0$ in equation (A-1).

It is not usually necessary to compute $P_c\{A \mid B\}$ (or $P_c\{A \mid \mathcal{O}_t\}$) for all $A \in \mathcal{A}$.
Substantial reduction of computing cost and computer memory can be achieved
if events A are restricted to smaller $\sigma$-fields. Eventually, in fact, we will
restrict attention to computation of $P_c\{A \mid B\}$ for $A \in \mathcal{U}_t$, where $\mathcal{U}_t$ is the
sub-$\sigma$-field generated by the non-observable random variable U(t). Notice that $\mathcal{U}_t$
pertains only to events which are associated with U at a single time t.


Markov  Models 


In order to develop efficient recursive computational procedures, it is useful
to structure M as a Markov process. This is not as restrictive as it may seem,
since in many cases what appears to be a non-Markov process can be transformed
into a Markov process by enlargement of the state space and by other devices.

Let us assume that M is Markovian, that $G(0, \cdot)$ denotes the initial probability
measure of M induced on the state space S, and that $\Gamma$ denotes the Markov transition
function. Let $t_1$ and $t_2$ denote two instants of time ($t_1 < t_2$). The transition
function has the following properties by definition:

(1)  $\Gamma(t_1, X; t_2, \cdot)$ is a probability distribution on $\beta$-measurable
subsets of S for $X \in S$.

(2)  $\Gamma(t_1, \cdot\,; t_2, A)$ is a $\beta$-measurable function on S for each
$\beta$-measurable subset A of S.

(3)  $\Gamma$ satisfies the Chapman-Kolmogorov integral equation,
i.e., if $t_1 < t' < t_2$, then

$$ \Gamma(t_1, X; t_2, A) = \int_S \Gamma(t_1, X; t', dY)\, \Gamma(t', Y; t_2, A). $$
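For a finite state space and a time-homogeneous chain, the transition function reduces to powers of a one-step matrix, and property (3) becomes a matrix-product identity. The following minimal sketch checks that identity numerically; the three-state matrix `P` and the helper names are illustrative assumptions and do not come from the report.

```python
# Illustrative check of the Chapman-Kolmogorov equation on a finite state space.
# The 3-state one-step matrix P is a made-up example, not taken from the report.

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transition(p, steps):
    """Transition matrix over `steps` time units for a homogeneous chain."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = mat_mul(result, p)
    return result

P = [[0.9, 0.1, 0.0],
     [0.2, 0.7, 0.1],
     [0.0, 0.3, 0.7]]

# Transition from time 0 to time 5 computed directly ...
direct = transition(P, 5)
# ... and by summing over the intermediate state Y at time t' = 2.
via_intermediate = mat_mul(transition(P, 2), transition(P, 3))

max_gap = max(abs(direct[i][j] - via_intermediate[i][j])
              for i in range(3) for j in range(3))
```

Here the sum over the intermediate state plays the role of the integral over `dY`; `max_gap` differs from zero only by floating-point rounding.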


Let $G(t, \cdot)$ denote the probability distribution induced on S by M(t) conditioned
upon the observations which have taken place at times up to and including t, i.e.,
given events in the $\sigma$-field $\mathcal{O}_t$.

For sets in $\mathcal{O}_t$ of the form $\{\omega : V(\tau_1, \omega) \in A_1^2, \ldots, V(\tau_\eta, \omega) \in A_\eta^2\}$ with
$0 < \tau_1 < \tau_2 < \cdots < \tau_\eta \le t$, the Markov structure permits expression of G(t, B)
explicitly in terms of the functions $G(0, \cdot)$ and $\Gamma$. Letting $A_j = S^1 \times A_j^2$ for $j = 1, \ldots, \eta$,
G(t, B) is given by

$$ G(t, B) = \frac{\int_S \int_{A_1} \cdots \int_{A_\eta} G(0, dX_0)\, \Gamma(0, X_0; \tau_1, dX_1) \cdots \Gamma(\tau_{\eta-1}, X_{\eta-1}; \tau_\eta, dX_\eta)\, \Gamma(\tau_\eta, X_\eta; t, B)}{\int_S \int_{A_1} \cdots \int_{A_\eta} G(0, dX_0)\, \Gamma(0, X_0; \tau_1, dX_1) \cdots \Gamma(\tau_{\eta-1}, X_{\eta-1}; \tau_\eta, dX_\eta)}, \qquad \text{(A-2)} $$

where the denominator is assumed to be non-zero (this is always true in our
applications).


In most situations of interest, equation (A-2) does not lend itself to easy
computation. There are two principal problems. The first is that the state space
S has very high dimension, and the second is that the functions $G(0, \cdot)$ and
$\Gamma$ are not usually conveniently expressed in terms of mathematical formulas.

For example, the transition function $\Gamma$ might be expressed in terms of scenarios
which specify the stochastic assumptions for target behavior in the mission under
consideration. As in Chapter II, these statements are most directly translated into
monte-carlo computer programs, rather than into "analytical formulas" suitable
for substitution in equation (A-2).
The Model of Chapter II

In order to motivate the introduction of additional mathematical structure for
the purpose of overcoming the two problems stated above, the following three
subsections will describe, in the formalism of this appendix, the model used to
calculate the examples given in Chapter II.
The principal objective in Chapter II is to compute and update the probability
distributions for target location making use of target-motion scenarios and sensor
response data. To do this, the components of the observable process
$V = (v_1, \ldots, v_{N_V})$ are defined to be (0, 1)-stochastic processes which describe
the time history of the sensor responses, i.e., for $1 \le n \le N_V$,

$$ v_n(t) = \begin{cases} 1 & \text{if the $n$th sensor is responding at time $t$, and} \\ 0 & \text{otherwise.} \end{cases} $$


Target motion assumptions. Let $Z_1$ and $Z_2$ denote the stochastic processes
for target latitude and longitude, respectively. In the model used in the examples,
the two-dimensional target location stochastic process $(Z_1, Z_2)$ by itself is not
Markovian. However, by addition of the target velocity stochastic process
$(S_1, S_2)$ and the scenario random variable k, the augmented five-dimensional process
$Z = (Z_1, Z_2, S_1, S_2, k)$ becomes Markovian. In other words, given Z(t), the future
$\{Z(t')\}_{t' > t}$ is statistically independent of the past $\{Z(t')\}_{t' < t}$.

The transition function for Z is specified in terms of the scenario descriptions
and is realized by monte-carlo simulation (see the discussion of program TRANS
in Chapter II).

Sensor-response assumptions. This subsection presents the sensor-response
assumptions for the model of Chapter II.

The single-sensor, single-glimpse probabilities of detection and false response
are assumed in Chapter II to be themselves random variables. This is done to
call attention to the fact that in most ASW situations there is not sufficient
information about target characteristics, sensor performance, and environmental
conditions (including non-target shipping) to provide high confidence inputs to a
detection or false-response calculation. Using the methodology presented in
Chapter II, one begins with an initial probability distribution for the uncertain
parameters and then modifies this distribution adaptively by utilizing the information
obtained from the sensor responses. In the language of systems theory (see
reference [c]), this is an example of adaptive state estimation and system identification.

In the illustrative model, the sensor response (0, 1)-random variables
$\{v_n(\tau_k) : 1 \le n \le N_V \text{ and } 1 \le k \le \eta\}$ are assumed to be mutually statistically
independent, conditioned upon knowledge of the target location and the detection and
false-response probability random variables $P_D$ and $P_F$. That is,*

$$ \Pr\{\tilde v_n(\tau_\eta) = v_n \text{ for } 1 \le n \le N_V \mid \tilde v_i(\tau_k) = v_i(\tau_k) \text{ for } 1 \le i \le N_V \text{ and } 1 \le k < \eta,\ Z_1(\tau_\eta), Z_2(\tau_\eta), P_F, P_D\} $$
$$ = \prod_{n=1}^{N_V} \Pr\{\tilde v_n(\tau_\eta) = v_n \mid Z_1(\tau_\eta), Z_2(\tau_\eta), P_F, P_D\}. \qquad \text{(A-3)} $$

* A symbol without a tilde denotes a specific value of the corresponding random variable.

Thus, knowledge of the target position at time $\tau_\eta$ and of the probabilities of
detection and false response makes the current observation random variables
mutually statistically independent and independent of their past values.
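The practical consequence of this conditional independence is that the likelihood of a whole vector of sensor responses is a product of single-sensor terms. The sketch below illustrates that product. The "cookie-cutter" detection rule (a sensor responds with probability `p_d` when the target is within a fixed range and with probability `p_f` otherwise), the sensor layout, and all names are assumptions made for illustration; the report's actual sensor model is given in Chapter II.

```python
import math

# Sketch of the conditional-independence assumption (A-3).
# ASSUMPTION (not from the report): a sensor "sees" the target when it lies within
# a fixed range; it then responds with probability p_d, otherwise with probability p_f.

def response_probability(sensor_xy, target_xy, p_d, p_f, detection_range=1.0):
    """Probability that one sensor responds, given target position, p_d and p_f."""
    in_range = math.dist(sensor_xy, target_xy) <= detection_range
    return p_d if in_range else p_f

def joint_likelihood(responses, sensors, target_xy, p_d, p_f):
    """Probability of the whole 0/1 response vector: a product over sensors."""
    likelihood = 1.0
    for v, sensor_xy in zip(responses, sensors):
        p = response_probability(sensor_xy, target_xy, p_d, p_f)
        likelihood *= p if v == 1 else (1.0 - p)
    return likelihood

sensors = [(0.0, 0.0), (3.0, 0.0), (6.0, 0.0)]   # hypothetical sensor positions
value = joint_likelihood([1, 0, 0], sensors, target_xy=(0.5, 0.0), p_d=0.8, p_f=0.1)
```

With the target near the first sensor, `value` is the product 0.8 × 0.9 × 0.9: one detection and two absent false responses.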

The observable process V by itself is usually not Markovian because, among
other things, the sensor responses depend upon the target's position, which is
not observable. However, when the unobservable process U is also specified,
the model $M = (U, V)$ is Markovian even though V is not.

The unobservable process. The unobservable process for the model of
Chapter II is defined by

$$ U(t) = (Z_1(t), Z_2(t), S_1(t), S_2(t), k, P_D) = (Z(t), P_D). $$

The process U (but not the observable process V) is Markovian, since, as we
have noted, if Z(t) is known, then the statistical properties of Z(t') are
determined for all t' > t and do not depend upon values of Z before time t.

The random variables k and $P_D$ are not time dependent and, hence, constitute
trivial Markov processes.


Reduction  of  State  Space  Dimensionality 


The assumptions in this section are made in order to reduce state space
dimensionality. Large state space dimensionality is one of the problems previously
mentioned concerning the evaluation of equation (A-2).

Let $A^i$ be a $\beta^i$-measurable subset of $S^i$ for i = 1 and 2 ($\beta^i$ is the Borel field
of $S^i$). Assume that U is Markovian and that $G(0, \cdot)$ and $\Gamma$ may be expressed in
the special form

$$ G(0, A^1 \times A^2) = \int_{A^1} L(0, dX^1)\, H(X^1, A^2) $$

and

$$ \Gamma(t_1, X_{t_1}; t_2, A^1 \times A^2) = \int_{A^1} \Lambda(t_1, X^1_{t_1}; t_2, dX^1_{t_2})\, H(X^1_{t_2}, A^2), $$

where $L(0, \cdot)$ and $\Lambda$ are, respectively, the initial probability distribution and the
transition function for the unobservable Markov process U. The value $H(X^1, A^2)$
is the conditional probability that $V(t) \in A^2$ given that $U(t) = X^1$. The function
$H(X^1, \cdot)$ is assumed to be a probability distribution on $\beta^2$ for every $X^1 \in S^1$, and
$H(\cdot, A^2)$ is assumed to be a $\beta^1$-measurable function on $S^1$ for each $\beta^2$-measurable
set $A^2 \subset S^2$.

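Under the above assumptions, it can be shown that for subsets of S of the
form $B^1 \times S^2$, where $B^1$ is any $\beta^1$-measurable subset of $S^1$, the function $G(t, \cdot)$
defined by equation (A-2) can be rewritten

$$ G(t, B^1 \times S^2) = \kappa_t \int_{S^1} \int_{S^1} \cdots \int_{S^1} \Big[\prod_{j=1}^{\eta} H(X_j^1, A_j^2)\Big]\, L(0, dX_0^1)\, \Lambda(0, X_0^1; \tau_1, dX_1^1) \cdots \Lambda(\tau_{\eta-1}, X_{\eta-1}^1; \tau_\eta, dX_\eta^1)\, \Lambda(\tau_\eta, X_\eta^1; t, B^1) \qquad \text{(A-4)} $$

for t > 0, where $\kappa_t$ is a normalizing constant defined so that $G(t, S^1 \times S^2) = 1$.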

The significance of equation (A-4) is that probabilities associated with the
unobservable process U and conditioned upon the observable process V may be
computed by integrations over the state space $S^1$ of U rather than by integrations
over the state space S of M as required by equation (A-2). Among other things,
this decreases the amount of computer memory required for processing the data
and usually can be expected to increase computing speed.

Another advantage is that equation (A-4) may be computed recursively. Let
$B^1 \in \beta^1$ and $A_j^2 \in \beta^2$ for $1 \le j \le \eta$. Further, let t > 0 and $\tau_0 = 0$. For notational
convenience, define $L(t, B^1) = G(t, B^1 \times S^2)$. Then one can show that for $j \le \eta - 1$

$$ L(\tau_{j+1}, B^1) = \kappa_{\tau_{j+1}} \int_{S^1} L(\tau_j, dX_j^1) \int_{B^1} \Lambda(\tau_j, X_j^1; \tau_{j+1}, dX_{j+1}^1)\, H(X_{j+1}^1, A_{j+1}^2), \qquad \text{(A-5)} $$

and for $t > \tau_\eta$ (the last time of sensor observation)

$$ L(t, B^1) = \int_{S^1} L(\tau_\eta, dX_\eta^1)\, \Lambda(\tau_\eta, X_\eta^1; t, B^1). \qquad \text{(A-6)} $$

The factor $\kappa_{\tau_{j+1}}$ appearing in equation (A-5) is a normalizing constant defined so
that $L(\tau_{j+1}, S^1) = 1$.

Equations (A-5) and (A-6) indicate that, in a sense, all relevant past information
about target motion and sensor response is contained in the most recent probability
distribution $L(t, \cdot)$ defined on $S^1$.
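When the unobservable state space is replaced by a finite set of cells, the recursion becomes an alternation of a motion step and an observation step. The sketch below makes three simplifying assumptions that are not in the report: the state space is a three-cell grid, the transition function is a fixed one-step matrix, and the observation likelihoods are made-up numbers.

```python
# Discrete-state sketch of the recursion (A-5)/(A-6).
# ASSUMPTIONS: S^1 is replaced by a finite set of cells, Lambda by a one-step
# transition matrix, and H(x, A_j) by a per-cell likelihood of the observed response.

def predict(dist, trans):
    """Motion update: integrate the current distribution against Lambda."""
    n = len(dist)
    return [sum(dist[i] * trans[i][k] for i in range(n)) for k in range(n)]

def update(dist, likelihood):
    """Observation update: weight by H and renormalize (the constant kappa)."""
    weighted = [p * h for p, h in zip(dist, likelihood)]
    total = sum(weighted)
    return [w / total for w in weighted]

trans = [[0.8, 0.2, 0.0],
         [0.1, 0.8, 0.1],
         [0.0, 0.2, 0.8]]
prior = [1 / 3, 1 / 3, 1 / 3]          # initial distribution L(0, .)
# Likelihood of the observed responses at tau_1 and tau_2 for each cell (made up).
likelihoods = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]

posterior = prior
for h in likelihoods:                   # equation (A-5), once per observation time
    posterior = update(predict(posterior, trans), h)
forecast = predict(posterior, trans)    # equation (A-6): motion only, no new data
```

Only the most recent distribution is carried from one step to the next, which is the sense in which it summarizes all past information.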
Numerical Computation


In applications of these concepts to large-scale, multi-sensor, multi-platform
operations, the conditional probabilities $L(t, B^1)$ have been computed from
equations (A-5) and (A-6) by using monte-carlo simulation of U and analytic
determination of $H(X^1, A^2)$. This section briefly outlines these computational
procedures in terms of an idealized computer-processing system.


The principal advantages of the computational procedures discussed in this
section and employed in Chapter II are as follows:


(1)  Realistic target motion scenarios and descriptions of sensor
behavior may be used when formulating the processing algorithms
since monte-carlo simulation reduces the need for introducing
artificial mathematical assumptions in order to obtain closed-form
solutions.

(2)  A minimum of computer core memory is required, since most
data are stored peripherally and processed sequentially.

(3)  In many cases, certain expressions can be precomputed, making
use of existing models such as the large-scale ASW simulation
models APAIR and APSURV. Off-line precomputation, when
feasible, results in rapid processing, which is particularly
useful in real-time tactical applications.

The reader should refer to the section of Chapter II entitled "Information
Processing Procedures" for a more detailed discussion in terms of the illustrative
model.

All information concerning past target movements and sensor responses
is contained in two external files, UFILE and WGHT. The processing consists of
reading these files into the computer in parallel and updating records a pair at
a time, one from each file.

Let UFILE(t) and WGHT(t) denote the contents of UFILE and WGHT, respectively,
at time t.

The file UFILE(t) contains $N_r$ monte-carlo samples of U(t), and the file
WGHT(t) contains $N_r$ "weights," each pertaining to the corresponding record of
UFILE(t).

Let $U^n(t)$ denote the nth simulated sample function of U(t) for $0 \le t$ and $1 \le n \le N_r$,
where $N_r$ denotes the number of replications. Since U is Markovian, knowledge of
$U^n(t)$ statistically determines the values of $U^n(t')$ for t' > t without reference to values
of $U^n(t')$ for t' < t. For $1 \le n \le N_r$, the nth weight $w^n(t)$ contained in WGHT(t) is
the probability $\prod_{j=1}^{\eta} H(U^n(\tau_j), A_j^2)$ based upon the observed sensor responses.

START denotes the computer program which creates UFILE(0) and WGHT(0).

The file UFILE(0) is created by generating $N_r$ monte-carlo samples from the
initial probability measure $L(0, \cdot)$ of U.

Since by definition no observations are associated with time t = 0, each record
of the initial file WGHT(0) contains the probability 1. These weights indicate
that at time t = 0, all samples of UFILE(0) are considered equally likely.

Now suppose that UFILE($t_1$) and WGHT($t_1$) associated with time $t_1$ are to be
updated to time $t_2$. As above, the times at which observations are obtained are
denoted $\tau_1, \ldots, \tau_\eta$, and $\tau_\eta \le t_2$ represents the time of the most recent observation.

Assume that $t_1 < \tau_1 < \cdots < \tau_\eta \le t_2$. The first step is to update UFILE($t_1$) to time $\tau_1$.

Let TRANS denote a computer program which updates UFILE by implementing
the transition function $\Lambda$. Let $t' = t_1$ and $\tau' = \tau_1$. The first record $U^1(t')$ of
UFILE(t') is read into the computer. The probability distribution $\Lambda(t', U^1(t'); \tau', \cdot)$
is then sampled by monte-carlo and the result $U^1(\tau')$ becomes the first record of
UFILE($\tau'$). This procedure is repeated for each record of UFILE(t') until all
records have been updated to $\tau'$.

The next step is to update the file WGHT(t'). OBSERV denotes the idealized
computer program for this purpose. The inputs to OBSERV include the newly
created file UFILE($\tau'$) and the file WGHT(t'); as with UFILE, updating is carried
out one record at a time. A pair of values $U^n(\tau')$ and $w^n(t')$ is then used to compute
$w^n(\tau')$ using the formula

$$ w^n(\tau') = w^n(t')\, H(U^n(\tau'), A_j^2), \qquad \tau' = \tau_j, $$

which follows from equation (A-5).

Once files UFILE($\tau_{j-1}$) and WGHT($\tau_{j-1}$) are completed for any $2 \le j \le \eta$, files
UFILE($\tau_j$) and WGHT($\tau_j$) are obtained by repeating the procedure with $t' = \tau_{j-1}$ and
$\tau' = \tau_j$. This continues until files UFILE($\tau_\eta$) and WGHT($\tau_\eta$) are generated.

If $t_2 > \tau_\eta$, then the final update consists of using TRANS to operate on file
UFILE($\tau_\eta$) in order to generate UFILE($t_2$) (see equation (A-6)). Since no new
observations occur between times $\tau_\eta$ and $t_2$, WGHT($t_2$) is a replica of WGHT($\tau_\eta$).

Any probability associated with the random variable U($t_2$) conditioned upon
the observed process V may be estimated using files UFILE($t_2$) and WGHT($t_2$)
and the formula

$$ \Pr\{U(t_2) \in B^1 \mid V(\tau_j) \in A_j^2 \text{ for } 1 \le j \le \eta\} \approx \frac{\sum_{n \in I(B^1)} w^n(t_2)}{\sum_{n=1}^{N_r} w^n(t_2)}, $$

where $I(B^1) = \{n : U^n(t_2) \in B^1\}$.
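The START, TRANS, and OBSERV cycle and the ratio-of-sums estimate can be sketched in a few lines. Everything specific in the example is an assumption made for illustration, not the report's model: the state is a scalar position following a Gaussian random walk, a single sensor sits at the origin, and the response model is a simple range rule. In-memory lists stand in for the external files.

```python
import random

# Toy version of the START / TRANS / OBSERV cycle with weighted monte-carlo samples.
# ASSUMPTIONS (illustrative only): U(t) is a scalar position doing a Gaussian random
# walk, one sensor sits at the origin, and H is a cookie-cutter response model.

N_R = 5000                      # number of replications
P_D, P_F, RANGE = 0.9, 0.05, 1.0

def start(rng):
    """Create UFILE(0) and WGHT(0): samples from L(0, .) and unit weights."""
    ufile = [rng.uniform(-5.0, 5.0) for _ in range(N_R)]
    return ufile, [1.0] * N_R

def trans(ufile, dt, rng):
    """Advance every record by sampling the transition function Lambda."""
    return [u + rng.gauss(0.0, 0.5 * dt ** 0.5) for u in ufile]

def observ(ufile, wght, response):
    """Multiply each weight by H(U^n, A): probability of the observed response."""
    new = []
    for u, w in zip(ufile, wght):
        p = P_D if abs(u) <= RANGE else P_F
        new.append(w * (p if response == 1 else 1.0 - p))
    return new

def estimate(ufile, wght, in_b1):
    """Weighted fraction of samples in B^1 (the ratio-of-sums formula)."""
    return sum(w for u, w in zip(ufile, wght) if in_b1(u)) / sum(wght)

rng = random.Random(0)
ufile, wght = start(rng)
prior_prob = estimate(ufile, wght, lambda u: abs(u) <= RANGE)
for response in (1, 1):         # two observation times, sensor responding at both
    ufile = trans(ufile, 1.0, rng)
    wght = observ(ufile, wght, response)
ufile = trans(ufile, 0.5, rng)  # final motion-only update; weights are unchanged
post_prob = estimate(ufile, wght, lambda u: abs(u) <= RANGE)
```

Two consecutive responses concentrate the weight on samples near the sensor, so `post_prob` exceeds `prior_prob`. The samples themselves are never resampled; only the weights carry the observation history, as in the procedure described above.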
APPENDIX B


FORMULATION  OF  THE  SEARCH  AND  SURVEILLANCE  PROBLEM  AS  A 
STATISTICAL  SEQUENTIAL  EXPERIMENTAL  DESIGN  PROBLEM 


by 


Thomas  L.  Corwin 


The purpose of this appendix is to suggest a theoretical framework in which
to relate information-theoretic concepts to surveillance in a false target
environment. It is shown that the problem of which cell to search at each stage of a
surveillance operation may be viewed as a game between the search planner and
Nature in which the payoff to the search planner is measured in terms of the
information he gains about the true state of Nature for a particular choice of cell
to search. Two sequential design procedures are examined in this context.

In the first section the surveillance problem is stated as a problem in statistical
hypothesis testing. In the second section some fundamental concepts of the
theory of sequential experimental design and of information theory are introduced.
The third section is devoted to the discussion of a general measure of the
information content of an experiment, called the discriminator function. In the fourth
section it is shown that the values assumed by the discriminator function may be
viewed as the potential payoffs to the experimenter in the play of a certain type of
two-person game. Discussion of the sequential design procedures of Chernoff and
Lindley as particular examples of such games is presented in this section.


Introduction 


Let a region in N-dimensional Euclidean space be divided into J non-null
measurable sets $\theta_j$ (the search cells) for $1 \le j \le J$.

Let $\theta_T$ denote the cell containing the target. In this appendix we assume
that the target is stationary. Let the parameter space be given by {1, ..., J}.
Assume also the existence of conditional probabilities R(j,k) for $1 \le j \le J$ and
$1 \le k \le J$, where R(j,k) is the probability of a response upon searching in $\theta_k$
given the target is located in $\theta_j$.

It is then desired to test the following simple hypothesis against the attending
composite alternatives:

$$ H_j : T = j \quad \text{against} \quad T = i,\ i \ne j, $$

for $1 \le j \le J$ and $1 \le i \le J$.

In the ensuing discussion the points of the parameter space {1, ..., J} will often
be referred to as "states of Nature."


Preliminaries 


Let us consider a measurable space $(\mathcal{X}, \mathcal{S})$ [reference [w], p. 2], i.e.,
$\mathcal{X}$ is a basic set of elements $x \in \mathcal{X}$ and $\mathcal{S}$ is a $\sigma$-algebra of subsets of $\mathcal{X}$. We
regard $\mathcal{X}$ as the sample space of an experiment and $\mathcal{S}$ as the set of all possible
events made up of elements of the sample space. Now let us consider the
construction of J probability spaces. For each possible state of Nature $j \in \{1, \ldots, J\}$,
let $\Xi_j$ be a probability measure defined on $\mathcal{S}$. We will assume that the probability
measures are mutually absolutely continuous and distinct. Thus, essentially we
are considering J probability spaces $(\mathcal{X}, \mathcal{S}, \Xi_j)$, $j \in \{1, \ldots, J\}$.

For example, in the application mentioned in the introduction, the sample
space for an experiment in which all of the cells $\{\theta_j : 1 \le j \le J\}$ are simultaneously
searched over and in which J = 3 is given by $\mathcal{X} = \{x_1, x_2, \ldots, x_8\}$, where

x_1 = (NR, NR, NR)
x_2 = (NR, NR, R)
x_3 = (NR, R, NR)
x_4 = (NR, R, R)
x_5 = (R, NR, NR)
x_6 = (R, NR, R)
x_7 = (R, R, NR)
x_8 = (R, R, R).

Here an "R" in the kth entry of $x_i$ indicates a response in cell k and an "NR" in the
kth entry of $x_i$ indicates no response in cell k, for $1 \le i \le 8$.
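The same eight outcomes can be generated mechanically as the Cartesian product of the per-cell outcomes. This is a small sketch using the standard library; the variable names are illustrative.

```python
from itertools import product

# Enumerate the sample space for J = 3 cells searched simultaneously.
# "NR" is listed before "R" so that the ordering matches x_1, ..., x_8 in the text.
J = 3
sample_space = list(product(("NR", "R"), repeat=J))
```

For general J the sample space has 2^J elements, which is why the text later restricts the experimenter to searching one cell per trial.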

The measures $\{\Xi_1, \ldots, \Xi_J\}$ may be constructed in this case by defining the
value of $\Xi_j$ on each element of the sample space. The value of $\Xi_j$ on a particular
element of the sample space is simply the probability that the particular sequence
of R's and NR's will be observed given that the true state of Nature is j, i.e., T = j
(or given that the target is in cell $\theta_j$). Thus, if it is assumed that cells are
searched independently, $\Xi_j$ in the example presented above is defined as follows:

$$
\begin{aligned}
\Xi_j(x_1) &= [1 - R(j,1)]\,[1 - R(j,2)]\,[1 - R(j,3)] \\
\Xi_j(x_2) &= [1 - R(j,1)]\,[1 - R(j,2)]\,R(j,3) \\
\Xi_j(x_3) &= [1 - R(j,1)]\,R(j,2)\,[1 - R(j,3)] \\
\Xi_j(x_4) &= [1 - R(j,1)]\,R(j,2)\,R(j,3) \\
\Xi_j(x_5) &= R(j,1)\,[1 - R(j,2)]\,[1 - R(j,3)] \\
\Xi_j(x_6) &= R(j,1)\,[1 - R(j,2)]\,R(j,3) \\
\Xi_j(x_7) &= R(j,1)\,R(j,2)\,[1 - R(j,3)] \\
\Xi_j(x_8) &= R(j,1)\,R(j,2)\,R(j,3)
\end{aligned}
$$

for $1 \le j \le J$.
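A short numerical sketch of this construction follows. The response matrix `R` contains made-up values chosen only for illustration (a high response probability in the target's own cell and a low false-response probability elsewhere); indices are 0-based in the code.

```python
from itertools import product

# Probability of each response pattern when the target is in cell j, assuming
# (as in the text) that cells respond independently.
# The numerical values in R are made up for illustration: R[j][k] is the probability
# of a response in cell k+1 given the target is in cell j+1 (0-based indices).
R = [[0.8, 0.1, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]

def outcome_probability(j, outcome):
    """Value of the measure for state j on one outcome such as ("NR", "R", "NR")."""
    prob = 1.0
    for k, entry in enumerate(outcome):
        prob *= R[j][k] if entry == "R" else 1.0 - R[j][k]
    return prob

sample_space = list(product(("NR", "R"), repeat=3))
measure = [[outcome_probability(j, x) for x in sample_space] for j in range(3)]
```

Each row of `measure` sums to one, confirming that the eight products define a probability measure for every state of Nature.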


Let us now consider a set of M random variables defined on the sample space
$\mathcal{X}$, denoted $Y_1, \ldots, Y_M$. In the above example, if the experimenter is allowed
to search only one cell, we could define a set of M = J random variables on the
sample space as follows:

$$ Y_m(x_i) = \begin{cases} 1 & \text{if the $m$th entry in $x_i$ is an "R,"} \\ 0 & \text{if the $m$th entry in $x_i$ is an "NR,"} \end{cases} \qquad \text{for } 1 \le i \le 8,\ 1 \le m \le J. $$

Let us now consider the following problem: Let us assume that for a
particular experiment there are J possible states of Nature. For each j, $1 \le j \le J$,
there exists a probability space $(\mathcal{X}, \mathcal{S}, \Xi_j)$, and on the sample space $\mathcal{X}$ are
defined M random variables, $Y_1, \ldots, Y_M$. Now let us also assume that available
to the experimenter are N trials in which he may observe any one of these M
random variables in order to make inference about which one of the J states of
Nature is the true one. In the terminology of the problem stated in the introduction,
the search planner has available to him N trials in each of which he may search
any of J cells in order to make a determination about the actual location of the
target. In this case M = J. The problem then is to determine which random
variable should be sampled at each trial in order to optimize his ability to discern
the actual state of Nature at the end of N trials. In the terminology of Chapter III,
the search planner is interested in maximizing S(N).


To this end, let us discuss the likelihood ratio statistic. For purposes of
simplicity, let us assume that for $1 \le m \le M$ the random variable $Y_m$ is real
valued. Let us also assume that the probability measure of $Y_m$, $1 \le m \le M$, is
absolutely continuous with respect to some fixed measure $\mu$ defined on the Borel
field of the real numbers. For each state of Nature $j \in \{1, \ldots, J\}$, let us
denote the density of the random variable $Y_m$ by $w_{m,j}$ for $1 \le m \le M$. Then the
likelihood ratio statistic for testing the hypothesis that j is the true state of Nature
against the alternative that k is the true state of Nature is given by

$$ \sum_{n=1}^{N} \log \frac{w_{m_n, j}(y_{m_n})}{w_{m_n, k}(y_{m_n})} \qquad \text{for } 1 \le j \le J \text{ and } 1 \le k \le J, $$

where

(i)  $y_{m_n}$ is the sampled value of the random variable $Y_{m_n}$,
the random variable sampled at the nth step, and

(ii)  $w_{m_n, j}(\cdot)$ is the density of the random variable $Y_{m_n}$
under the hypothesis that j is the true state of Nature.

Thus, at the nth step, the increment in the likelihood ratio statistic is given by

$$ \hat\delta(j, k, m_n, y_{m_n}) = \log \frac{w_{m_n, j}(y_{m_n})}{w_{m_n, k}(y_{m_n})} \qquad \text{for } 1 \le j \le J \text{ and } 1 \le k \le J. $$

Intuitively, if j is the true state of Nature, $\hat\delta(j, k, m_n, y_{m_n})$ represents the
additional ability, obtained through sampling $Y_{m_n}$ at the nth step, to discriminate
between the hypotheses j and k. Roughly speaking, at the nth step one would
prefer to sample the random variable $Y_{m^*}$ which maximizes this increment on
the average.
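For the search application the sampled variables are 0/1 responses, so each density is Bernoulli and the statistic is a running sum of log ratios. The sketch below assumes such Bernoulli densities with a made-up response matrix `R`; the search history is likewise hypothetical.

```python
import math

# Accumulating the log-likelihood ratio over a sequence of single-cell searches.
# ASSUMPTION: each Y_m is a 0/1 response, so its density under state j is
# Bernoulli with parameter R[j][m]; the matrix R is made up for illustration.
R = [[0.8, 0.1, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]

def density(m, j, y):
    """Probability of response y (1 or 0) in cell m when the state is j."""
    return R[j][m] if y == 1 else 1.0 - R[j][m]

def increment(j, k, m, y):
    """Increment in the log-likelihood ratio of j against k from observing y in cell m."""
    return math.log(density(m, j, y) / density(m, k, y))

# Search history: (cell searched, observed response) at each step.
history = [(0, 1), (1, 0), (0, 1)]
statistic = sum(increment(0, 1, m, y) for m, y in history)
```

A positive `statistic` favors state 0 over state 1; here two responses in cell 0 and a non-response in cell 1 all point the same way.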


In the terminology of Kullback and Leibler, reference [k], given the M
random variables $\{Y_m : 1 \le m \le M\}$ from which to choose at the nth step, the
experimenter would prefer to choose that experiment which maximizes the
information number I(j, k, m) defined as follows:

$$ I(j, k, m) = \int_{-\infty}^{\infty} \log\!\left[\frac{w_{m,j}(x)}{w_{m,k}(x)}\right] w_{m,j}(x)\, d\mu(x) \qquad \text{for } 1 \le j \le J,\ 1 \le k \le J,\ 1 \le m \le M. \qquad \text{(B-1)} $$

Implicit in this definition is the assumption that j is the true state of Nature.
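When the responses are 0/1 valued, the integral in (B-1) is a two-term sum and the information number can be tabulated for every cell. The sketch below assumes Bernoulli response densities and a made-up response matrix `R`; it then picks the cell that best separates two given states.

```python
import math

# Kullback-Leibler information number I(j, k, m) for 0/1-valued search responses.
# ASSUMPTION: mu is counting measure on {0, 1}, so the integral in (B-1) is a
# two-term sum; the response matrix R is made up for illustration.
R = [[0.8, 0.1, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]

def density(m, j, y):
    """Probability of response y (1 or 0) in cell m when the state is j."""
    return R[j][m] if y == 1 else 1.0 - R[j][m]

def information_number(j, k, m):
    """Expected log-likelihood-ratio increment when j is true and cell m is searched."""
    return sum(density(m, j, y) * math.log(density(m, j, y) / density(m, k, y))
               for y in (0, 1))

# Which cell best separates state 0 from state 1?
scores = [information_number(0, 1, m) for m in range(3)]
best_cell = max(range(3), key=lambda m: scores[m])
```

Searching the third cell yields no information about states 0 versus 1 because its response probability is the same under both, so its score is zero.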


The Discriminator Function

Since at the nth step of any sequential scheme the experimenter, in general,
does not know the true state of Nature, he is faced with the problem of choosing
one of the random variables $\{Y_m : 1 \le m \le M\}$ to maximize his ability to
discriminate between some estimate of the true parameter value T and the
remaining parameters. Thus, one is led to consider maximization of analogs of
the expected increment in the likelihood ratio statistic or the Kullback-Leibler
information number presented in expression (B-1). The analogs to be considered
here have the following form:

(i)  Let $\{\lambda_i : 1 \le i \le J\}$ be a set of real numbers such
that $0 \le \lambda_i \le 1$ for $1 \le i \le J$ and $\sum_{i=1}^{J} \lambda_i = 1$.

(ii)  For each $m \in \{1, \ldots, M\}$, let $\phi(m, \cdot)$ be a
density with respect to the measure $\mu$.

Then define the "discriminator" function D as follows:

$$ D(m, \phi(m, \cdot), \lambda_1, \ldots, \lambda_J) = \int_{-\infty}^{\infty} \sum_{j=1}^{J} \lambda_j \log\!\left[\frac{\phi(m, x)}{w_{m,j}(x)}\right] \phi(m, x)\, d\mu(x) \qquad \text{for } 1 \le m \le M. \qquad \text{(B-2)} $$
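On a finite outcome set the discriminator in (B-2) is a weighted sum of Kullback-Leibler terms. The sketch below assumes densities given as probability vectors with counting measure, and uses made-up numbers; it checks that placing all weight on a single alternative and taking the reference density equal to one state's density recovers the information number of (B-1).

```python
import math

# Discrete sketch of the discriminator function (B-2).
# ASSUMPTION: densities are probability vectors over a finite outcome set and mu is
# counting measure; w[j] plays the role of w_{m,j} for one fixed experiment m.

def discriminator(phi, w, lam):
    """Sum over x of phi(x) * sum_j lam[j] * log(phi(x) / w[j](x))."""
    total = 0.0
    for x, phi_x in enumerate(phi):
        if phi_x == 0.0:
            continue                      # 0 * log 0 is taken as 0
        total += phi_x * sum(l * math.log(phi_x / w_j[x])
                             for l, w_j in zip(lam, w) if l > 0.0)
    return total

# Response densities (no response, response) under three states; values are made up.
w = [[0.2, 0.8], [0.9, 0.1], [0.9, 0.1]]

# Choosing phi = w[0] and putting all weight on state 1 recovers I(0, 1, m).
d_kl = discriminator(w[0], w, [0.0, 1.0, 0.0])
kl = sum(p * math.log(p / q) for p, q in zip(w[0], w[1]))
```

Other choices of the reference density and of the weights give the further special cases discussed in the text.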


Many of the important discriminator functions discussed in the literature
appear as special cases of the discriminator D, for specific choices of the
function $\phi$ and the real numbers $\lambda_1, \ldots, \lambda_J$. For instance:

(i)  Let j and k be two elements of the parameter space
{1, ..., J}; then define the function $\eta$ by

$$ \eta(j, m, x) = w_{m,j}(x) \qquad \text{for } -\infty < x < \infty,\ 1 \le m \le M,\ 1 \le j \le J \qquad \text{(B-3)} $$

and define

$$ \omega(k, i) = \begin{cases} 1 & \text{for } i = k \\ 0 & \text{for } i \ne k \end{cases} \qquad \text{for } 1 \le i \le J \text{ and } 1 \le k \le J. \qquad \text{(B-4)} $$

Then the number $D(m, \eta(j, m, \cdot), \omega(k, 1), \ldots, \omega(k, J))$ is the
expected increment in the likelihood ratio statistic for testing
the hypothesis that j is the true state of Nature against the
alternative that k is the true state of Nature when sampling the
random variable $Y_m$. If $i_{n-1}$ is the mode of the posterior
probabilities defined on the parameter space {1, ..., J} at
the (n-1)st step and $i^*_{n-1}$ is the mode of the posterior
probabilities defined on the parameter space at the (n-1)st step
restricted to the set $\{1, \ldots, J\} - \{i_{n-1}\}$, then the number
$D(m, \eta(i_{n-1}, m, \cdot), \omega(i^*_{n-1}, 1), \ldots, \omega(i^*_{n-1}, J))$ is a form of
the discrimination number used in Chernoff's procedure A
(reference [x]) to be discussed later.

(ii)  Let us modify example (i) above slightly to produce a
different discrimination function D. Instead of the function $\omega$
defined in (B-4), let us use a real-valued function $\omega$ defined
on the parameter space {1, ..., J}, satisfying

(a)  $0 \le \omega(i) \le 1$, for $i \in \{1, \ldots, J\} - \{i_{n-1}\}$

(b)  $\omega(i_{n-1}) = 0$

(c)  $\sum_{i=1}^{J} \omega(i) = 1$

(d)  $D(m, \eta(i_{n-1}, m, \cdot), \omega(1), \ldots, \omega(J)) = \inf_{(\lambda_1, \ldots, \lambda_J) \in \Lambda(i_{n-1})} D(m, \eta(i_{n-1}, m, \cdot), \lambda_1, \ldots, \lambda_J)$,

where

$$ \Lambda(k) = \Big\{(\lambda_1, \ldots, \lambda_J) : 0 \le \lambda_i \le 1 \text{ for } i \in \{1, \ldots, J\},\ \sum_{i=1}^{J} \lambda_i = 1, \text{ and } \lambda_k = 0\Big\}. \qquad \text{(B-5)} $$

As will be discussed later, the discrimination number
$D(m, \eta(i_{n-1}, m, \cdot), \omega(i^*_{n-1}, 1), \ldots, \omega(i^*_{n-1}, J))$ represents the
payoff to the experimenter in a particular game outlined in
Chernoff's procedure A for his choice of random variable
at the nth step when Nature may choose from among only a
certain set of her pure strategies (see fourth section); while
the discriminator $D(m, \eta(i_{n-1}, m, \cdot), \omega(1), \ldots, \omega(J))$
represents the payoff to the experimenter in the same game
for his choice of random variable $Y_m$ when Nature is allowed
to choose from among a wider class of her mixed strategies.

(iii) Let $\{\alpha^{n-1}(i) : 1 \le i \le J\}$ be the set of posterior
probabilities defined on the parameter space $\{1, \ldots, J\}$ at the
(n-1)st step. Then define the function E as follows:

$$E(m, x) = \sum_{i=1}^{J} \alpha^{n-1}(i)\, w_{m,i}(x) \quad \text{for } -\infty < x < \infty \text{ and } 1 \le m \le M.$$

Then for $1 \le m \le M$, the number
$D(m, E(m, \cdot), \alpha^{n-1}(1), \ldots, \alpha^{n-1}(J))$
is the expected decrease in the entropy of the posterior
probabilities on the parameter space at the nth step given that
the experimenter samples random variable $Y_m$. Notice here
that the choice of m to optimize
$D(m, E(m, \cdot), \alpha^{n-1}(1), \ldots, \alpha^{n-1}(J))$
is essentially an attempt to increase the average expected
power to discriminate between the current estimate of the
density of $Y_m$ given by $\sum_{i=1}^{J} \alpha^{n-1}(i)\, w_{m,i}(\cdot)$
and the densities of $Y_m$ under the assumption that each of the
parameters $\{1, \ldots, J\}$ represents the true state of Nature.


A number of simple results regarding the function D may be proved using the
results of Kullback.


THEOREM B-1. For $1 \le m \le M$,

$$D(m, \phi, \lambda_1, \ldots, \lambda_J) \ge 0. \tag{B-6}$$

Proof. In Theorem 3.1, p. LI, of reference [k], Kullback shows that for
$1 \le j \le J$ and $1 \le m \le M$,

$$\int_{-\infty}^{\infty} \log\left[\frac{\phi(m, x)}{w_{m,j}(x)}\right] \phi(m, x)\, d\mu(x) \ge 0.$$

Thus, expression (B-6) follows from the non-negativity of the elements of
$\{\lambda_j : 1 \le j \le J\}$, which proves the theorem.
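As an illustration of Theorem B-1, the following is a minimal sketch for the special case in which every $Y_m$ takes finitely many values and $\mu$ is counting measure, so that the integrals become sums. It assumes D has the weighted form used in the proof above, $D = \sum_j \lambda_j \int \log[\phi / w_{m,j}]\, \phi\, d\mu$. The function names and the numerical values are hypothetical and are not taken from the report.

```python
import math

def kl(phi, w):
    """Discrete form of  sum_x phi(x) * log(phi(x) / w(x))."""
    return sum(p * math.log(p / q) for p, q in zip(phi, w) if p > 0)

def discrimination(phi_m, w_m, lam):
    """D(m, phi, lam_1..lam_J) = sum_j lam_j * KL(phi(m,.) || w_{m,j})."""
    return sum(l * kl(phi_m, w_mj) for l, w_mj in zip(lam, w_m))

# Hypothetical example: a single experiment m, J = 3 states of Nature,
# and a binary outcome x in {0, 1}.
w_m = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]  # rows are w_{m,j}(.), j = 1..3
phi_m = [0.6, 0.4]                          # reference density phi(m, .)
lam = [0.2, 0.3, 0.5]                       # weights lambda_j, summing to 1

d_value = discrimination(phi_m, w_m, lam)

# When phi(m, .) coincides with w_{m,2} and all weight is on j = 2,
# the discrimination number vanishes.
d_zero = discrimination(w_m[1], w_m, [0.0, 1.0, 0.0])
```

Each term of the sum is a Kullback-Leibler number, which is non-negative, so a non-negative weighting cannot make the total negative.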


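Let $D(m_1, m_2, \phi, \lambda_1, \ldots, \lambda_J)$ be defined by

$$D(m_1, m_2, \phi, \lambda_1, \ldots, \lambda_J) = \sum_{j=1}^{J} \lambda_j \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \log\left[\frac{\phi(m_1, x_1)\, \phi(m_2, x_2)}{w_{m_1,j}(x_1)\, w_{m_2,j}(x_2)}\right] \phi(m_1, x_1)\, \phi(m_2, x_2)\, d\mu(x_1)\, d\mu(x_2)$$

for $1 \le m_1 \le M$ and $1 \le m_2 \le M$,

where

(i) $0 \le \lambda_j \le 1$ for $1 \le j \le J$ and $\sum_{j=1}^{J} \lambda_j = 1$, and

(ii) for each m, $\phi(m, \cdot)$ is a density with respect
to the measure $\mu$.

Then let us prove the following theorem.

THEOREM B-2. For $1 \le m_1 \le M$ and $1 \le m_2 \le M$,

$$D(m_1, m_2, \phi, \lambda_1, \ldots, \lambda_J) = D(m_1, \phi, \lambda_1, \ldots, \lambda_J) + D(m_2, \phi, \lambda_1, \ldots, \lambda_J).$$

Proof. We have

$$D(m_1, m_2, \phi, \lambda_1, \ldots, \lambda_J) = \sum_{j=1}^{J} \lambda_j \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \log\left[\frac{\phi(m_1, x_1)\, \phi(m_2, x_2)}{w_{m_1,j}(x_1)\, w_{m_2,j}(x_2)}\right] \phi(m_1, x_1)\, \phi(m_2, x_2)\, d\mu(x_1)\, d\mu(x_2)$$

$$= \sum_{j=1}^{J} \lambda_j \int_{-\infty}^{\infty} \log\left[\frac{\phi(m_1, x_1)}{w_{m_1,j}(x_1)}\right] \phi(m_1, x_1)\, d\mu(x_1) + \sum_{j=1}^{J} \lambda_j \int_{-\infty}^{\infty} \log\left[\frac{\phi(m_2, x_2)}{w_{m_2,j}(x_2)}\right] \phi(m_2, x_2)\, d\mu(x_2)$$

$$= D(m_1, \phi, \lambda_1, \ldots, \lambda_J) + D(m_2, \phi, \lambda_1, \ldots, \lambda_J).$$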
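The additivity in Theorem B-2 can be checked numerically in the discrete case. The sketch below assumes finitely many outcomes with counting measure in place of $\mu$, and evaluates the double sum in the definition of the joint discrimination number directly. All names and numbers are illustrative assumptions, not values from the report.

```python
import math

def kl(phi, w):
    """Discrete form of  sum_x phi(x) * log(phi(x) / w(x))."""
    return sum(p * math.log(p / q) for p, q in zip(phi, w) if p > 0)

def d_single(phi_m, w_m, lam):
    """D(m, phi, lam) = sum_j lam_j * KL(phi(m,.) || w_{m,j})."""
    return sum(l * kl(phi_m, w_mj) for l, w_mj in zip(lam, w_m))

def d_joint(phi1, phi2, w1, w2, lam):
    """D(m1, m2, phi, lam): the double sum over (x1, x2), weighted by lam_j."""
    total = 0.0
    for l, w1j, w2j in zip(lam, w1, w2):
        for p1, q1 in zip(phi1, w1j):
            for p2, q2 in zip(phi2, w2j):
                if p1 > 0 and p2 > 0:
                    total += l * p1 * p2 * math.log((p1 * p2) / (q1 * q2))
    return total

# Hypothetical densities for two experiments and J = 3 states of Nature.
lam = [0.2, 0.3, 0.5]
w1 = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]                    # w_{m1,j}(.)
w2 = [[0.3, 0.3, 0.4], [0.6, 0.2, 0.2], [0.1, 0.1, 0.8]]     # w_{m2,j}(.)
phi1 = [0.6, 0.4]                                            # phi(m1, .)
phi2 = [0.25, 0.25, 0.5]                                     # phi(m2, .)

joint = d_joint(phi1, phi2, w1, w2, lam)
separate = d_single(phi1, w1, lam) + d_single(phi2, w2, lam)
```

The two quantities agree to rounding error because the logarithm of the product splits into a sum and each density integrates to one.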




Thus, Theorems B-1 and B-2 show that D, as defined in (B-2), is non-negative
and additive in the sense discussed above. Both of those results are, of course,
consequences of the nature of the log function used in the two definitions of D.
Special cases of Theorems B-1 and B-2 are presented in a variety of sources
such as Kullback, reference [k]; Lindley, reference [f]; DeGroot, reference [y];
and Box and Hill, reference [z].


Formulation as a Two-Person, Zero-Sum Game

The concepts of game theory were first introduced into the study of sequential
experimental design by Chernoff in reference [x]. Here we generalize those
notions to a wider class of games between the experimenter and Nature to include
the selection procedure described by Lindley in reference [f].

Game formulation. At each stage of the experiment, we assume that the
experimenter is interested in maximizing a quantity of the form (B-2). The
number of points in the parameter space, alternatively referred to as states of
Nature, as well as the number of experiments available to the experimenter,
is finite, and we formulate this problem as a two-person, zero-sum game as follows.

Let us assume that before a given trial in an experiment, the experimenter
must decide which of M random variables $\{Y_m : 1 \le m \le M\}$ he will sample. He
also believes that, depending upon the true state of Nature, the payoff to him will
vary according to his choice of random variable. Now let us say that the
experimenter has decided that for $1 \le j \le J$ and $1 \le m \le M$, if the true state of
Nature is "j" and he chooses random variable $Y_m$, the payoff to him will be
$p(j, m)$ given by

$$p(j, m) = \int_{-\infty}^{\infty} \log\left[\frac{\phi(m, x)}{w_{m,j}(x)}\right] \phi(m, x)\, d\mu(x),$$

where

(i) $w_{m,j}(\cdot)$ is the density with respect to $\mu$ of the random
variable $Y_m$, given that j is the true state of Nature, and

(ii) for each $m \in \{1, \ldots, M\}$, $\phi(m, \cdot)$ is a density with
respect to the measure $\mu$.

However, let us also assume that, in addition to the choice of a particular state
of Nature j and a particular random variable $Y_m$, called pure strategies for each
player, the players may choose mixed strategies. A mixed strategy for Nature
is a probability function $\lambda$ defined on $\{1, \ldots, J\}$ and denoted by
$\{\lambda_j : 1 \le j \le J\}$, where $\lambda_j$ represents the probability with which
Nature chooses parameter j at the step in question for $1 \le j \le J$. A mixed
strategy for the experimenter is a probability function $\gamma$ defined on
$\{1, \ldots, M\}$ and denoted by $\{\gamma_m : 1 \le m \le M\}$,
where $\gamma_m$ represents the probability with which the experimenter chooses the
random variable $Y_m$ for $1 \le m \le M$. However, in some cases, the rules of the
game may specify that either Nature or the experimenter or both may choose
only from among pure strategies.

Thus, if Nature chooses mixed strategy $\lambda$, the experimenter would obviously
like to choose the random variable $Y_m$ to maximize his expected payoff. If Nature
chooses mixed strategy $\lambda$, then the expected payoff to the experimenter for his
choice of random variable $Y_m$ is given by

$$\Gamma(m) = \sum_{j=1}^{J} \lambda_j\, p(j, m) = D(m, \phi, \lambda_1, \ldots, \lambda_J) \quad \text{for } 1 \le m \le M.$$


Thus, if the experimenter may assume that he knows Nature's strategy at any step,
his best strategy is to choose the random variable $Y_{m^*}$, where $m^*$ maximizes
$\Gamma$ over the set $\{1, \ldots, M\}$. In this case, the maximum payoff to the
experimenter, called the value of this game, is given by

$$v = \max_{m \in \{1, \ldots, M\}} \Gamma(m) = \max_{m \in \{1, \ldots, M\}} D(m, \phi(m, \cdot), \lambda_1, \ldots, \lambda_J).$$


If, on the other hand, we may assume that Nature chooses a strategy from
the class of mixed strategies G in such a way as to minimize the maximum payoff,
then the value of the game to the experimenter is given by

$$v^* = \max_{m \in \{1, \ldots, M\}}\ \min_{(\lambda_1, \ldots, \lambda_J) \in G}\ \sum_{j=1}^{J} \lambda_j\, p(j, m) = \max_{m \in \{1, \ldots, M\}}\ \min_{(\lambda_1, \ldots, \lambda_J) \in G} D(m, \phi(m, \cdot), \lambda_1, \ldots, \lambda_J).$$
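The two game values just described can be sketched for a small payoff table. The code below assumes the payoffs $p(j, m)$ have already been computed and, purely for illustration, takes the class G to be a finite list of mixed strategies (here Nature's three pure strategies). The payoff numbers and function names are hypothetical.

```python
def expected_payoff(p, lam, m):
    """Gamma(m) = sum_j lam_j * p(j, m), with p indexed as p[j][m]."""
    return sum(lam[j] * p[j][m] for j in range(len(lam)))

def value_known_strategy(p, lam):
    """Best experiment and value v when Nature's mixed strategy lam is known."""
    n_experiments = len(p[0])
    best = max(range(n_experiments), key=lambda m: expected_payoff(p, lam, m))
    return best, expected_payoff(p, lam, best)

def value_maximin(p, strategies):
    """Best experiment and value v* when Nature picks the worst lam in G."""
    n_experiments = len(p[0])
    worst = [min(expected_payoff(p, lam, m) for lam in strategies)
             for m in range(n_experiments)]
    best = max(range(n_experiments), key=lambda m: worst[m])
    return best, worst[best]

# Hypothetical payoff table: J = 3 states (rows), M = 2 experiments (columns).
p = [[0.0, 0.3],
     [0.5, 0.1],
     [0.2, 0.4]]
lam = [0.2, 0.3, 0.5]

m_star, v = value_known_strategy(p, lam)

# G taken to be Nature's pure strategies (an assumption for this example).
G = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
m_maximin, v_star = value_maximin(p, G)
```

Experiments are indexed from 0 in the code, so index 1 corresponds to $m = 2$ in the notation of the text.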


Chernoff's procedure A. As above, let $i_{n-1}$ be the mode of the posterior
distribution of the parameters at step n-1. Then in reference [x] Chernoff has
suggested that the experimenter choose the random variable $Y_m$ at the nth step to
maximize the function $\Gamma_1$ defined on the set $\{1, \ldots, M\}$ as follows:

$$\Gamma_1(m) = \inf_{(\lambda_1, \ldots, \lambda_J) \in \Lambda(i_{n-1})}\ \sum_{j=1}^{J} \lambda_j\, I(i_{n-1}, j, m) \quad \text{for } 1 \le m \le M,$$

where

(i) the function I is defined in (B-1) and

(ii) $\Lambda(i_{n-1})$ is defined in expression (B-5) above.

Chernoff has shown that this is equivalent to a choice on the part of the
experimenter of a pure strategy to maximize his payoff in a game with payoff
function $p_1$ given by

$$p_1(j, m) = \int_{-\infty}^{\infty} \log\left[\frac{w_{m,i_{n-1}}(x)}{w_{m,j}(x)}\right] w_{m,i_{n-1}}(x)\, d\mu(x) \quad \text{for } 1 \le j \le J \text{ and } 1 \le m \le M,$$

where it is assumed that Nature is free to choose a mixed strategy from among
all mixed strategies giving zero weight to the mode of the posterior distribution
of the parameters at step n-1, and the experimenter is free to choose his strategy
only from among his pure strategies. Here the value of the game at step n is
given by

$$\max_{m \in \{1, \ldots, M\}}\ \inf_{(\lambda_1, \ldots, \lambda_J) \in \Lambda(i_{n-1})}\ \sum_{j=1}^{J} \lambda_j\, p_1(j, m).$$
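A minimal sketch of the selection step in procedure A follows, for discrete outcomes with counting measure. It assumes the payoff is the Kullback-Leibler number of the density under the posterior mode relative to the density under state j. Because the weighted sum is linear in the weights, its infimum over mixed strategies that give zero weight to the mode is attained at a pure strategy, so the code takes a minimum over $j \ne i_{n-1}$. The densities, the posterior, and the function names are illustrative assumptions.

```python
import math

def kl(p, q):
    """Discrete Kullback-Leibler number  sum_x p(x) * log(p(x) / q(x))."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

def procedure_a_choice(w, posterior):
    """Pick the experiment maximizing min over j != mode of KL(w_mode || w_j).

    w[m][j] is the outcome distribution of Y_m when j is the true state.
    """
    n_states = len(posterior)
    mode = max(range(n_states), key=lambda j: posterior[j])
    scores = []
    for w_m in w:
        scores.append(min(kl(w_m[mode], w_m[j])
                          for j in range(n_states) if j != mode))
    best = max(range(len(w)), key=lambda m: scores[m])
    return mode, best, scores

# Hypothetical setup: M = 2 experiments, J = 3 states, binary outcomes.
w = [
    [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]],  # Y_1 separates state 1 from the rest
    [[0.5, 0.5], [0.5, 0.5], [0.1, 0.9]],  # Y_2 cannot separate states 1 and 2
]
posterior = [0.5, 0.3, 0.2]

mode, best, scores = procedure_a_choice(w, posterior)
```

In this example the second experiment scores zero because the modal state and one alternative induce the same outcome distribution, so the first experiment is chosen.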


In terms of the discriminator function D defined in expression (B-2), Chernoff
states that at the nth step the experimenter should sample the random variable $Y_m$
to maximize the function $D(\cdot, \eta(i_{n-1}, \cdot, \cdot), \omega(1), \ldots, \omega(J))$
over the set $\{1, \ldots, M\}$. Thus, the value of the game may also be written as

$$\max_{m \in \{1, \ldots, M\}} D(m, \eta(i_{n-1}, m, \cdot), \omega(1), \ldots, \omega(J)).$$


As has been stated previously, if it is assumed that Nature may choose only
from among her pure strategies, then the value of the game is given by

$$\max_{m \in \{1, \ldots, M\}} D(m, \eta(i_{n-1}, m, \cdot), \omega(i_{n-1}, 1), \ldots, \omega(i_{n-1}, J)).$$

It is a simple matter to show that


Under mild restrictions, Chernoff has shown the following in reference [x].
(The stopping rule of procedure A is not directly relevant to our discussion and
need not be defined here.)

LEMMA 1*. Let the stopping rule for procedure A be disregarded. Let $\tau$
be the smallest integer such that $i_n = T$ for $n \ge \tau$. Then there exist
$b_1 > 0$ and $b_2 > 0$ such that

$$\Pr\{\tau > n\} \le b_1 e^{-b_2 n} \quad \text{for } n \ge 1.$$


While in reference [x] Chernoff has proved that procedure A has certain
desirable asymptotic properties, he has also pointed out that procedure A may
lead to "initial bungling," since "At first it is desirable to apply experiments
which are informative for a broad range of parameter values. Maximizing the
Kullback-Leibler information number may give experiments which are efficient
only when $\theta$ is close to the estimated value."

Lindley's procedure. Lindley, reference [f], has suggested an alternative
approach to the sequential experimental design problem, which, as he points out,
applies Shannon's definition of the information content of a probability distribution
to the discussion of the notion of information in an experiment. This is closely
related to the approach taken in formulating surveillance Policy II in Chapter III,
as we shall see below.

Lindley  defines  the  amount  of  information  provided  by  an  experiment  as  the 
expected  change  in  the  entropy  of  the  posterior  probabilities  of  the  parameters 
as  a result  of  performing  the  experiment. 

For example, if $\alpha^{n-1}$ is the posterior distribution of the parameters
$\{1, \ldots, J\}$ at the (n-1)st step, then for $1 \le m \le M$ the Shannon
information content in the selection of random variable $Y_m$ at the nth step is
$s(m)$ given as follows:


$$s(m) = \int_{-\infty}^{\infty} \left\{ \sum_{j=1}^{J} f_{m,j}(x) \log\left[\frac{f_{m,j}(x)}{\alpha^{n-1}(j)}\right] \right\} \sum_{j=1}^{J} \alpha^{n-1}(j)\, w_{m,j}(x)\, d\mu(x), \tag{B-7}$$

where

$$f_{m,j}(x) = \frac{\alpha^{n-1}(j)\, w_{m,j}(x)}{\sum_{i=1}^{J} w_{m,i}(x)\, \alpha^{n-1}(i)} \quad \text{for } -\infty < x < \infty,\ 1 \le j \le J.$$

* Implicit in the proof of this lemma presented in reference [x] is the fact
that for each $m \in \{1, \ldots, M\}$ and for any pair of distinct parameter values i
and j, there exists a set $B \subset (-\infty, \infty)$ such that $\mu(B) \ne 0$ and
$\int_B w_{m,i}(x)\, d\mu(x) \ne \int_B w_{m,j}(x)\, d\mu(x)$.


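Using Chernoff's results in reference [aa], if E is defined by

$$E(m, x) = \sum_{j=1}^{J} \alpha^{n-1}(j)\, w_{m,j}(x) \quad \text{for } 1 \le m \le M \text{ and } -\infty < x < \infty,$$

then it is easily shown that

$$s(m) = D(m, E(m, \cdot), \alpha^{n-1}(1), \ldots, \alpha^{n-1}(J)) \quad \text{for } 1 \le m \le M.$$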


Lindley in reference [f] suggests that the experimenter choose the random
variable $Y_m$ at the nth step to maximize the function s defined above.

The following discussion relates Lindley's procedure to the maximum
information-gain policy formulated in Chapter III.

In the context of Chapter III, $\alpha^{n-1}$ indicates the current target location
probability distribution $P_B$ and $Y_m$ indicates the outcome obtained if the mth cell
is searched. We define $Y_m$ so that $Y_m = 1$ if a response is obtained and $Y_m = 0$
if no response is obtained. The measure $\mu$ in equation (B-7) assigns weight 1 to
each of the sets $\{0\}$ and $\{1\}$, and $\mu(A - \{0, 1\}) = 0$ for any measurable set A.

The quantity $w_{m,j}(r)$ is the probability of obtaining an outcome r given that
cell m is searched and the target is in cell j. Here, r = 1 indicates a response
and r = 0 indicates no response. In the notation of Chapter III,

$$w_{m,j}(r) = Q(r, j, m).$$


The probability $f_{m,j}(r)$ is the probability that the target is in cell j given
that cell m is searched and response r is obtained. In the notation of Chapter III,

$$f_{m,j}(r) = P_A(r, j, m).$$


If the entropy of a discrete distribution P on J cells is denoted H[P], i.e.,

$$H[P] = -\sum_{j=1}^{J} P(j) \ln P(j),$$

and if, as in Chapter III,

$$U(m) = \sum_{j=1}^{J} \sum_{r=0}^{1} P_B(j)\, Q(r, j, m)\, H[P_A(r, \cdot, m)],$$

then

$$s(m) = -\sum_{r=0}^{1} H[P_A(r, \cdot, m)] \sum_{j=1}^{J} P_B(j)\, Q(r, j, m) - \sum_{r=0}^{1} \sum_{j=1}^{J} P_B(j)\, Q(r, j, m) \ln P_B(j) = -U(m) + H[P_B]. \tag{B-8}$$


Equation (B-8) indicates that finding the m which maximizes s(m) (the Lindley
approach) is equivalent to finding the m which minimizes U(m) (Chapter III approach).
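The relation in (B-8) can be illustrated with a small cell-search example. The sketch below assumes a simple sensor model that is not taken from Chapter III: searching cell m yields a response with probability `pd` if the target is in that cell and with false-response probability `pf` otherwise. The prior, the two probabilities, and the function names are hypothetical.

```python
import math

def entropy(p):
    """H[P] = -sum_j P(j) ln P(j)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def q(r, j, m, pd, pf):
    """Q(r, j, m): probability of outcome r when cell m is searched, target in j."""
    p_response = pd if j == m else pf
    return p_response if r == 1 else 1.0 - p_response

def posterior(r, m, pb, pd, pf):
    """Return (probability of outcome r, posterior P_A(r, ., m))."""
    joint = [pb[j] * q(r, j, m, pd, pf) for j in range(len(pb))]
    pr = sum(joint)
    return pr, [x / pr for x in joint]

def u(m, pb, pd, pf):
    """U(m): expected posterior entropy after searching cell m."""
    total = 0.0
    for r in (0, 1):
        pr, pa = posterior(r, m, pb, pd, pf)
        total += pr * entropy(pa)
    return total

def s(m, pb, pd, pf):
    """s(m) from (B-7), with the integral reduced to a sum over r in {0, 1}."""
    total = 0.0
    for r in (0, 1):
        pr, pa = posterior(r, m, pb, pd, pf)
        total += pr * sum(a * math.log(a / b) for a, b in zip(pa, pb) if a > 0)
    return total

pb = [0.5, 0.3, 0.2]   # hypothetical current location distribution P_B
pd, pf = 0.8, 0.1      # hypothetical detection and false-response probabilities

cells = range(len(pb))
s_values = [s(m, pb, pd, pf) for m in cells]
u_values = [u(m, pb, pd, pf) for m in cells]
best_by_s = max(cells, key=lambda m: s_values[m])
best_by_u = min(cells, key=lambda m: u_values[m])
```

Since $H[P_B]$ does not depend on m, the cell that maximizes s(m) is the same cell that minimizes U(m).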

Like Chernoff's procedure A, Lindley's procedure for choosing a random
variable to sample at the nth step may also be considered within the context of
game theory. Once again, we think of Nature and the experimenter as playing a
game with a particular payoff function. In this case the payoff function is slightly
different from the one assumed by Chernoff in his procedure A. For a choice by
Nature of the parameter j and a choice by the experimenter of the random variable
$Y_m$, Lindley assumes that the payoff to the experimenter at the nth step is given by

$$p_2(j, m) = \int_{-\infty}^{\infty} \log\left[\frac{\sum_{i=1}^{J} w_{m,i}(x)\, \alpha^{n-1}(i)}{w_{m,j}(x)}\right] \sum_{i=1}^{J} w_{m,i}(x)\, \alpha^{n-1}(i)\, d\mu(x) \quad \text{for } 1 \le j \le J \text{ and } 1 \le m \le M. \tag{B-9}$$


Thus, the payoff assumed by Lindley in his game with Nature is significantly
different from that assumed by Chernoff. Also different is the strategy assumed
for Nature. Lindley assumes that for her strategy at the nth step, Nature chooses
the mixed strategy $\alpha^{n-1}$. That is, Nature chooses parameter j with probability
$\alpha^{n-1}(j)$ at the nth step.

Consequently, assuming that Nature plays mixed strategy $\alpha^{n-1}$ at step n,
if the experimenter wishes to maximize his expected payoff, he must choose the
random variable $Y_{m^*}$ such that $m^*$ maximizes the function $\Gamma_2$ defined as follows:

$$\Gamma_2(m) = \sum_{j=1}^{J} \alpha^{n-1}(j)\, p_2(j, m) = s(m) = D(m, E, \alpha^{n-1}(1), \ldots, \alpha^{n-1}(J)) \quad \text{for } 1 \le m \le M.$$


Thus, the value of the game, or the maximum information which the experimenter
can derive from a sample at the nth trial, is given by

$$v_2 = \max_{m \in \{1, \ldots, M\}} D(m, E, \alpha^{n-1}(1), \ldots, \alpha^{n-1}(J)).$$


Large-sample results similar to the ones obtained for Chernoff's procedure A
in reference [x] have not yet been obtained for Lindley's procedure. However,
it is fairly obvious that since Lindley's procedure uses all the information about
the parameters available to the experimenter at every stage, it will not be subject
to "initial bungling" to the same extent as Chernoff's procedure A. Conversely,
because it relies at each step on the information regarding all of the parameters
rather than only the most likely one, as in procedure A, its large-sample
properties may not be as dramatic as those attending Chernoff's procedure.




